(54) Predictive modeling of customer financial behavior



(57) Predictive modeling of consumer financial behavior is provided by application of consumer transaction data to predictive models associated with merchant segments. The merchant segments are derived from the consumer transaction data based on co-occurrences of merchants in sequences of transactions. Merchant vectors represent specific merchants, and are aligned in a vector space as a function of the degree to which the merchants co-occur more or less frequently than expected. Merchant vectors are clustered to form the merchant segments. Analysis of merchant segments details transaction rates, volumes and amounts for the segment and its individual merchants. For each merchant segment a predictive model is trained using consumer transaction data in selected past time periods to predict spending in subsequent time periods. The merchant segment predictive models provide predictions of spending in each merchant segment for any particular consumer, based on previous spending by the consumer. Consumer profiles describe summary statistics of each consumer's spending in the merchant segments, and across merchant segments. The consumer profiles include consumer vectors derived as summary vectors of selected merchants patronized by the consumer. Membership functions associate each consumer with one or more merchant segments. Analysis of the consumers associated with a segment allows for identification of selected consumers according to predicted spending in the segment or other criteria, and the targeting of promotional offers specific to the segment and its merchants.

Description

BACKGROUND

Field of Invention

[0001] The present invention relates generally to analysis of consumer financial behavior, and more particularly to analyzing historical consumer financial behavior to accurately predict future spending behavior, and more particularly, future spending in specifically identified data-driven industry segments.

Background of Invention

[0002] Retailers, advertisers, and many other institutions are keenly interested in understanding consumer spending habits. These companies invest tremendous resources to identify and categorize consumer interests, in order to learn how consumers spend money. If the interests of an individual consumer can be determined, then it is believed that advertising and promotions related to these interests will be more successful in obtaining a positive consumer response, such as purchases of the advertised products or services.

[0003] Conventional means of determining consumer interests have generally relied on collecting demographic information about consumers, such as income, age, place of residence, occupation, and so forth, and associating various demographic categories with various categories of interests and merchants. Interest information may be collected from surveys, publication subscription lists, product warranty cards, and myriad other sources. Complex data processing is then applied to the source of data, resulting in some demographic and interest description of each of a number of consumers.

[0004] This approach to understanding consumer behavior often misses the mark. The ultimate goal of this type of approach, whether acknowledged or not, is to predict consumer spending in the future. The assumption is that consumers will spend money on their interests, as expressed by things like their subscription lists and their demographics. Yet the data on which the determination of interests is made is typically only indirectly related to the actual spending patterns of the consumer. For example, most publications have developed demographic models of their readership, and offer their subscription lists for sale to others interested in the particular demographics of the publication's readers. But subscription to a particular publication is a relatively poor indicator of what the consumer's spending patterns will be in the future.

[0005] Even taking into account multiple different sources of data, such as combining subscription lists, warranty registration cards, and so forth, still only yields an incomplete collection of unrelated data about a consumer.

[0006] One of the problems in these conventional approaches is that spending patterns are time based. That is, consumers spend money at merchants which are of interest to them in typically a time related manner. For example, a consumer who is a business traveler spends money on plane tickets, car rentals, hotel accommodations, restaurants, and entertainment all during a single business trip. These purchases together more strongly describe the consumer's true interests and preferences than any single one of the purchases alone. Yet conventional approaches to consumer analysis typically treat these purchases individually and as unrelated in time.

[0007] Yet another problem with conventional approaches is that categorization of purchases is often based on standardized industry classifications of merchants and businesses, such as the SIC codes. This set of classifications is entirely arbitrary, and has little to do with actual consumer behavior. Consumers do not decide which merchants to purchase from based on their SIC code. Thus, the use of arbitrary classifications to predict financial behavior is doomed to failure, since the classifications have little meaning in the actual data of consumer spending.

[0008] A third problem is that different groups of consumers spend money in different ways. For example, consumers who frequent high-end retailers have entirely different spending habits than consumers who are bargain shoppers. To deal with this problem, most systems focus exclusively on very specific, predefined types of consumers, in effect assuming that the interests or types of consumers are known, and targeting these consumers with what are believed to be advertisements or promotions of interest to them. However, this approach essentially puts the cart before the proverbial horse: it assumes the interests and spending patterns of a particular group of consumers; it does not discover them from actual spending data. It thus begs the question as to whether the assumed group of consumers in fact even exists, or has the interests that are assumed for it.

[0009] Accordingly, what is needed is the ability to model consumer financial behavior based on actual historical spending patterns that reflect the time-related nature of each consumer's purchases. Further, it is desirable to extract meaningful classifications of merchants based on the actual spending patterns, and from the combination of these, predict future spending of an individual consumer in specific, meaningful merchant groupings.

[0010] In the application domain of information, and particularly text, retrieval, vector based representations of documents and words are known. Vector space representations of documents are described in U.S. Pat. No. 5,619,709 issued to Caid et al., and in U.S. Pat. No. 5,325,298 issued to Gallant. Generally, vectors are used to represent words or documents. The relationships between words and between documents are learned and encoded in the vectors by a learning law. However, because these uses of vector space representations, including the context vectors of Caid, are designed primarily for information retrieval, [...] when applied to [...] prediction problems, [...] had numerous shortcomings [...] documents such as credit card statements. Because Caid's system [...] frequency merchants were not [...] to the system's ability to predict [...] transaction data.

SUMMARY OF THE INVENTION

[0011] The present invention overcomes the limitations of conventional approaches to consumer analysis by pro- [...]

[...] Preferably, each consu- [...] "you are whom you shop at," since the vectors of the merchants are used to construct the vectors of the consumers. An advantage of this approach is that both consumers and merchants are represented in a common vector space. This means that given a consumer vector, the merchant vectors which are "similar" to this consumer vector can be readily determined (that is, they point in generally the same direction in the vector space) using dot product analysis. Thus, merchants who are "similar" to the consumer can be easily determined, these being merchants who would likely be of interest to the consumer, even if the consumer has never purchased from these merchants before.

[0016] Given the merchant segments, the present invention then creates a predictive model of future spending in each merchant segment, based on transaction statistics of historical spending in the merchant segment by those consumers who have purchased from merchants in the segments, in other segments, and data on overall purchases. In one embodiment, each predictive model predicts spending in a merchant cluster in a predicted time interval, such as 3 months, based on historical spending in the cluster in a prior time interval, such as the previous 6 months. During model training, the historical transactions in the merchant cluster for consumers who spent in the cluster are summarized in each consumer's profile in summary statistics, and input into the predictive model along with actual spending in a predicted time interval. Validation of the predicted spending with actual spending is used to confirm model performance. The predictive models may be neural networks, or other multivariate statistical models.

[0017] This modeling approach is advantageous for two reasons. First, the predictive models are specific to merchant clusters that actually appear in the underlying spending data, instead of for arbitrary classifications of merchants such as SIC classes. Second, because the consumer spending data of those consumers who actually purchased at the merchants in the merchant clusters is used, the models most accurately reflect how these consumers have spent and will spend at these merchants.

[0018] To predict financial behavior, the consumer profile of a consumer, using preferably the same type of summary statistics for a recent, past time period, is input into the predictive models for the different merchant clusters. The result is a prediction of the amount of money that the consumer is likely to spend in each merchant cluster in a future time interval, for which no actual spending data may yet be available.

[0019] For each consumer, a membership function may be defined which describes how strongly the consumer is associated with each merchant segment. (Preferably, the membership function outputs a membership value for each merchant segment.) The membership function may be the predicted future spending in each merchant segment, or it may be a function of the consumer vector for the consumer and a merchant segment vector (e.g. centroid of each merchant segment). The membership function can be weighted by the amount spent by the consumer in each merchant segment or other factors. Given the membership function, the merchant clusters for which the consumer has the highest membership values are of particular interest: they are the clusters in which the consumer will spend the most money in the future, or whose spending habits are most similar to the merchants in the cluster. This allows very specific and accurate targeting of promotions, advertising and the like to these consumers. A financial institution using the predicted spending information can direct promotional offers to consumers who are predicted to spend heavily in a merchant segment, with the promotional offers associated with merchants in the merchant segment.

[0020] Also, given the membership values, changes in the membership values can be readily determined over time, to identify transitions by the consumer between merchant segments of interest. For example, each month (e.g. after a new credit card billing period or bank statement), the membership function is determined for a consumer, resulting in a new membership value for each merchant cluster. The new membership values can be compared with the previous month's membership values to indicate the largest positive and negative changes, revealing the consumer's changing purchasing habits. Positive changes reflect purchasing interests in new merchant clusters; negative changes reflect the consumer's lack of interest in a merchant cluster in the past month. Segment transitions such as these further enable a financial institution to target consumers with promotions for merchants in the segments in which the consumers show significant increases in membership values.
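The month-over-month comparison of membership values described above can be illustrated with a minimal sketch. The function names, segment names, and membership values below are hypothetical and chosen only for illustration; the source does not prescribe a particular data layout.

```python
def membership_changes(previous, current):
    """Return the per-segment change in membership value (current minus previous)."""
    segments = set(previous) | set(current)
    return {s: current.get(s, 0.0) - previous.get(s, 0.0) for s in segments}


def largest_transitions(previous, current):
    """Return (segment with the largest increase, segment with the largest decrease)."""
    changes = membership_changes(previous, current)
    rising = max(changes, key=changes.get)
    falling = min(changes, key=changes.get)
    return rising, falling


# Hypothetical membership values for one consumer in two consecutive months.
last_month = {"travel": 0.10, "home_improvement": 0.55, "dining": 0.35}
this_month = {"travel": 0.45, "home_improvement": 0.20, "dining": 0.35}

rising, falling = largest_transitions(last_month, this_month)
print(rising, falling)
```

A segment with a large positive change would be a candidate for the promotional targeting described above; how large a change counts as "significant" is left open here.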

[0021] In another aspect, the present invention provides an improved methodology for learning the relationships 
between merchants in transaction data, and defining vectors which represent the merchants. More particularly, this 
aspect of the invention accurately identifies and captures the patterns of spending behavior which result in the co- 
occurrence of transactions at different merchants. The methodology is generally as follows: 

[0022] First, the number of times that each pair of merchants co-occur with one another in the transaction data is determined. The underlying intuition here is that merchants whom the consumers' behaviors indicate as being related will occur together often, whereas unrelated merchants do not occur together often. For example, a new mother will likely shop at children's clothes stores, toy stores, and other similar merchants, whereas a single young male will likely not shop at these types of merchants. The identification of merchants is by counting occurrences of merchants' names in the transaction data. The merchants' names may be normalized to reduce variations and equate different versions of a merchant's name to a single common name.
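As a rough illustration of the pair counting step, the sketch below counts merchant pairs that fall within a forward window of a fixed number of following transactions. Measuring the window in transactions, the default window size, and the function name `count_cooccurrences` are assumptions made for this example; the input is assumed to be each consumer's merchant names in date order.

```python
from collections import Counter


def count_cooccurrences(merchants_by_consumer, window=3):
    """Count unordered merchant pairs that co-occur within `window` following transactions."""
    pair_counts = Counter()
    for merchants in merchants_by_consumer.values():
        for i, anchor in enumerate(merchants):
            # Look only forward from the anchor transaction.
            for other in merchants[i + 1 : i + 1 + window]:
                if other != anchor:
                    pair_counts[frozenset((anchor, other))] += 1
    return pair_counts


# Hypothetical date-ordered merchant sequences for two consumers.
sequences = {"C1": ["A", "C", "E"], "C2": ["B", "D"]}
narrow = count_cooccurrences(sequences, window=1)
wide = count_cooccurrences(sequences, window=2)
print(narrow[frozenset(("A", "C"))], narrow[frozenset(("A", "E"))], wide[frozenset(("A", "E"))])
```

With a window of one transaction, A and E do not co-occur for consumer C1; widening the window to two transactions makes them co-occur once.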

[0023] Next, a relationship strength between each pair of merchants is determined based on how much the observed co-occurrence of the merchants deviated from an expected co-occurrence of the merchant pair. The expected co-occurrence is based on statistical measures of how frequently the individual merchants appear in the transaction data or in co-occurrence events. Various relationship strength measures may be used, based on, for example, standard deviations of predicted co-occurrence, or log-likelihood ratios.
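One way to picture a deviation-based relationship strength is sketched below. It is not the measure defined by the source: the expected count assumes merchants pair up independently in proportion to how often each appears in co-occurrence events, and the strength is the deviation divided by the square root of the expected count. Both choices, and the name `relationship_strengths`, are assumptions for illustration only.

```python
import math
from collections import Counter


def relationship_strengths(pair_counts):
    """Score each merchant pair by (observed - expected) / sqrt(expected)."""
    total = sum(pair_counts.values())
    # Number of co-occurrence events each merchant takes part in.
    events = Counter()
    for pair, n in pair_counts.items():
        for merchant in pair:
            events[merchant] += n
    merchants = sorted(events)
    strengths = {}
    for i, a in enumerate(merchants):
        for b in merchants[i + 1 :]:
            observed = pair_counts.get(frozenset((a, b)), 0)
            # Assumed independence model for the expected pair count.
            expected = events[a] * events[b] / (2 * total)
            strengths[(a, b)] = (observed - expected) / math.sqrt(expected)
    return strengths


# Hypothetical co-occurrence counts.
counts = {
    frozenset(("A", "C")): 8,
    frozenset(("A", "B")): 1,
    frozenset(("B", "D")): 6,
    frozenset(("C", "D")): 1,
}
strengths = relationship_strengths(counts)
print(round(strengths[("A", "C")], 2), round(strengths[("A", "B")], 2))
```

Pairs that co-occur more often than the independence model predicts receive positive scores, and pairs that co-occur less often receive negative scores.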

[0024] The relationship strength measure has the features that two merchants that co-occur significantly more often than expected are positively related to one another; two merchants that co-occur significantly less often than expected are negatively related to one another, and two merchants that co-occur about the number of times expected [...]

[0025] The relationship strength between each pair of merchants is then mapped into the vector space. This is done by determining the desired dot product between each pair of merchant vectors as a function of the relationship strength between the pair of merchants. This [...] positive dot product, the merchant vectors for negatively related merchants have a negative dot product, and the [...] updated [...] dot products between them at least closely approximate the desired dot products previously determined.

[...] determining whether any two strings represent the same thing, such as variant spellings of a merchant name. This aspect of the invention is beneficially used to identify and normalize [...] a variety of different spellings or forms of a same merchant name in large [...] unit vectors are defined for each trigram. Each string (e.g. merchant name) to be compared [...] is defined as the sum of the unit vectors for each trigram in the string, weighted by the [...]. Any two strings, such as merchant names, can now be compared by taking their dot product. If the dot product is above a threshold (determined from analysis of the data set), then the strings are deemed to be equivalent to each other. Normalizing the length of the string vectors may be used to make the comparison insensitive to the length of the original strings. With either partial (normalization of one string but not the other) or [...] length influences the comparison but may be used to match parts of one string against the entirety of another string.
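The trigram comparison described above can be sketched as follows. Each distinct trigram is treated as its own axis, so a string vector is simply a table of trigram counts. The trigram weighting is not legible in the source, so unweighted counts are assumed here; stripping punctuation and case, normalizing both vectors, the 0.7 threshold, and the function names are likewise assumptions.

```python
import math
from collections import Counter


def trigram_vector(text):
    """Sum of unit vectors, one axis per distinct trigram (unweighted counts assumed)."""
    cleaned = "".join(ch for ch in text.upper() if ch.isalnum())
    return Counter(cleaned[i : i + 3] for i in range(len(cleaned) - 2))


def similarity(a, b):
    """Dot product of the two length-normalized trigram vectors."""
    va, vb = trigram_vector(a), trigram_vector(b)
    dot = sum(count * vb[tri] for tri, count in va.items())
    norm_a = math.sqrt(sum(c * c for c in va.values()))
    norm_b = math.sqrt(sum(c * c for c in vb.values()))
    if norm_a == 0 or norm_b == 0:
        return 0.0
    return dot / (norm_a * norm_b)


def same_merchant(a, b, threshold=0.7):
    """Deem two names equivalent when their similarity exceeds an assumed threshold."""
    return similarity(a, b) > threshold


print(same_merchant("STARBUCKS COFFEE", "Starbucks Cofee"))
print(same_merchant("STARBUCKS COFFEE", "SHELL OIL"))
```

In practice the threshold would be determined from analysis of the data set, as the text states; 0.7 is only a placeholder.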
This methodology provides an extremely fast and accurate mechanism for string matching. The matching process may be used to determine [...] whether two merchant names are the same, two company names [...], names or the like. This is useful in applications needing to reconcile divergent sources or types of data containing strings which reference a common group of entities (e.g. transaction records from many transaction sources contain- [...]

[0028] The present invention may be embodied in various forms. As a computer program product, the present [...] in databases. A profiling engine applies consumer profiles and consumer transaction data to the predictive models to provide predicted spending in each merchant segment, and to compute membership functions of the merchant segments. A reporting engine outputs reports in various formats regarding the predicted spending and membership [...]. A segment transition detection engine computes changes in each consumer's membership values to identify significant transitions of the consumer between merchant clusters. The present invention may also be embodied as a system, with the above program product element cooperating with computer hardware components, and as a computer implemented method.

DESCRIPTION OF THE DRAWINGS



[0029]

Figs. 1a-1c are illustrations of merchant and consumer vector representations.
Fig. 2 is a sample list of merchant segments.
Fig. 3 is a flowchart of the overall process of the present invention.
Fig. 4a is an illustration of the system architecture of one embodiment of the present invention during operation.
Fig. 4b is an illustration of the system architecture of the present invention during development and training of merchant vectors, and merchant segment predictive models.
Fig. 5 is an illustration of the functional components of the predictive model generation system.
Figs. 6a and 6b are illustrations of forward and backward co-occurrence windows.
Fig. 7a is an illustration of the master file data prior to stemming and equivalencing, and
Fig. 7b is an illustration of a forward co-occurrence window in this portion of the master file after stemming and equivalencing.
Fig. 8 is an illustration of the various types of observations during model training.
Fig. 9 is an illustration of the application of multiple consumer account data to the multiple segment predictive models.

DETAILED DESCRIPTION OF THE PREFERRED EMBODIMENTS

[0030]

A. Overview of Consumer and Merchant Vector Representation and the Co-occurrence of Merchant Purchases
B. System Overview
C. Functional Overview
D. Data Preprocessing Module
E. Predictive Model Generation System
   1. Merchant Vector Generation
   2. Training of Merchant Vectors: The UDL Algorithm
      a) Co-occurrence Counting
         i) Forward co-occurrence counting
         ii) Backward co-occurrence counting
         iii) Bi-directional co-occurrence counting
      b) Estimating Expected Co-occurrence Counts
      c) Desired Dot-Products between Merchant Vectors
      d) Merchant Vector Training
   3. Clustering Module
F. Data Postprocessing Module
G. Predictive Model Generation
H. Profiling Engine
   1. Membership Function: Predicted Spending In Each Segment
   2. Segment Membership Based on Consumer Vectors
      a. Updating of Consumer Profiles
I. Reporting Engine
   1. Basic Reporting Functionality
   2. General Segment Report
      a) General Segment Information
      b) Segment Members Information
      c) Lift Chart
      d) Population Statistics Tables
         i) Segment Statistics
         ii) Row Descriptions
J. Targeting Engine
K. Segment Transition Detection



A. OVERVIEW OF CONSUMER AND MERCHANT VECTOR REPRESENTATION AND THE CO-OCCURRENCE OF MERCHANT PURCHASES

[0031] One feature of the present invention that enables prediction of consumer spending levels at specific merchants is the ability to represent both consumers and merchants in a same modeling representation. A conventional example is attempting to classify both consumers and merchants with demographic labels (e.g. 'baby boomers', or 'empty-nesters'). This conventional approach is simply arbitrary, and does not provide any mechanisms for directly quantifying how similar a consumer is to various merchants. The present invention, however, does provide such a quantifiable analysis, based on high-dimensional vector representations of both consumers and merchants, and the co-occurrence of merchants in the spending data of individual consumers.

[0032] Referring now to Figs. 1a and 1b, there is shown a simplified model of the vector space representation of merchants and consumers. The vector space 100 is shown here with only three axes, but in practice is a high dimensional hypersphere, typically having 100-300 components. In this vector space 100, each merchant is assigned a merchant vector. Preferably, the initial assignment of each merchant's vector contains essentially randomly valued vector components, to provide for a quasi-orthogonal distribution of merchant vectors. This means that initially, the merchant vectors are essentially perpendicular to each other, so that there is no predetermined or assumed association or similarity between merchants.
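The quasi-orthogonality of randomly initialized high-dimensional vectors can be checked numerically. The sketch below assumes Gaussian-distributed components, 300 dimensions, and unit-length normalization; these are illustrative choices, as is the fixed random seed.

```python
import math
import random


def random_unit_vector(dim, rng):
    """Draw a random direction by normalizing a vector of Gaussian components."""
    v = [rng.gauss(0.0, 1.0) for _ in range(dim)]
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


rng = random.Random(0)  # fixed seed so the example is repeatable
names = ["A", "B", "C", "D", "E"]
merchant_vectors = {name: random_unit_vector(300, rng) for name in names}

# Largest absolute dot product over all distinct merchant pairs.
max_abs_dot = max(
    abs(dot(merchant_vectors[a], merchant_vectors[b]))
    for i, a in enumerate(names)
    for b in names[i + 1 :]
)
print(max_abs_dot)
```

For unit vectors in 300 dimensions, the dot product of two random directions is typically within a few hundredths of zero, which is what "essentially perpendicular" means in practice.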

[0033] In Fig. 1a, there are shown merchant vectors for five merchants, A, B, C, D, and E after initialization, and prior to being updated. Merchant A is an upscale clothing store, merchant B is a discount furniture store, merchant C is an upscale furniture store, merchant D is a discount clothing catalog outlet, and merchant E is an online store for fashion jewelry. As shown in Fig. 1c, merchants A and D have the same SIC code because they are both clothing stores, and merchants B and C have the same SIC code because they are both furniture stores. In other words, the SIC codes do not distinguish between the types of consumers who frequent these stores.
[0034] In Fig. 1b, there is shown the same vector space 100 after consumer spending data has been processed according to the present invention to train the merchant vectors. The training of merchant vectors is based on co-occurrence of merchants in each consumer's transaction data. Fig. 1c illustrates consumer transaction data 104 for two consumers, C1 and C2. The transaction data for C1 includes transactions 110 at merchants A, C, and E. In this example, the transactions at merchants A and C co-occur within a co-occurrence window 108; likewise the transactions at merchants C and E co-occur within a separate co-occurrence window 108. The transaction data for C2 includes transactions 110 at merchants B and D, which also form a co-occurrence event.

[0035] Merchants for whom transactions co-occur in a consumer's spending data have their vectors updated to point more in the same direction in the vector space, that is, making their respective vector component values more similar.
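To make the idea of pulling co-occurring merchants' vectors together concrete, the sketch below nudges a pair of unit vectors until their dot product approaches a desired value. This is a generic error-driven update written for illustration and is not the UDL training algorithm named in the outline; the learning rate, the renormalization step, and the function names are all assumptions.

```python
import math


def normalize(v):
    norm = math.sqrt(sum(x * x for x in v))
    return [x / norm for x in v]


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def nudge_pair(a, b, desired, rate=0.1):
    """Move each vector toward (or away from) the other in proportion to the dot-product error."""
    error = desired - dot(a, b)
    new_a = normalize([x + rate * error * y for x, y in zip(a, b)])
    new_b = normalize([y + rate * error * x for x, y in zip(a, b)])
    return new_a, new_b


# Two merchants start out perpendicular, as after random initialization.
a, b = [1.0, 0.0, 0.0], [0.0, 1.0, 0.0]
for _ in range(200):
    a, b = nudge_pair(a, b, desired=0.8)
print(round(dot(a, b), 3))
```

A positive desired dot product draws the two vectors toward the same direction; a negative one pushes them apart, matching the positively and negatively related cases described in the summary.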

[0036] Thus, in Fig. 1b, following processing of the consumer transaction data, the merchant vectors for merchants A, C, and E have been updated, based on actual spending data, such as C1's transactions, to point generally in the same direction, as have the merchant vectors for merchants B and D, based on C2's transactions. Clustering techniques are then used to identify clusters or segments of merchants based on their merchant vectors 402. In the example of Fig. 1b, a merchant segment is defined to include merchants A, C, and E, such as "upscale-technology_savvy." Note that as defined above, the SIC codes of these merchants are entirely unrelated, and so SIC code analysis would not reveal this group of merchants. Further, a different segment with merchants B and D is identified, even though the merchants share the same SIC codes with the merchants in the first segment, as shown in the transaction data 104.
[0037] Each merchant segment is associated with a merchant segment vector 105, preferably the centroid of the merchant cluster. Based on the types of merchants in the merchant segment, and the consumers who have purchased in the segment, a segment name can be defined, and may express the industry, sub-industry, geography, and/or consumer demographics.

[0038] The merchant segments provide very useful information about the consumers. In Fig. 1b there are shown the consumer vectors 106 for consumers C1 and C2. Each consumer's vector is a summary vector of the merchants at which the consumer shops. This summary is preferably the vector sum of the merchant vectors of the merchants at which the consumer has shopped in a defined recent time interval. The vector sum can be weighted by the recency of the purchases, their dollar amount, or other factors.
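A weighted vector sum of this kind might look like the following sketch. The specific weighting (dollar amount multiplied by an exponential recency decay with a 90-day half-life), the final normalization, and the name `consumer_vector` are assumptions for illustration; the text only says the sum can be weighted by recency, dollar amount, or other factors.

```python
import math


def consumer_vector(purchases, merchant_vectors, half_life_days=90.0):
    """Weighted sum of merchant vectors; purchases are (merchant, amount, days_ago) tuples."""
    dim = len(next(iter(merchant_vectors.values())))
    total = [0.0] * dim
    for merchant, amount, days_ago in purchases:
        # Assumed weighting: larger and more recent purchases count more.
        weight = amount * 0.5 ** (days_ago / half_life_days)
        for i, component in enumerate(merchant_vectors[merchant]):
            total[i] += weight * component
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total] if norm else total


# Two hypothetical merchants in a toy two-dimensional space.
vectors = {"A": [1.0, 0.0], "B": [0.0, 1.0]}
purchases = [("A", 100.0, 0), ("B", 100.0, 90)]
cv = consumer_vector(purchases, vectors)
print([round(x, 3) for x in cv])
```

Because the purchase at B is one half-life old, it contributes half the weight of the equally sized purchase at A, so the consumer vector leans toward A.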

[0039] Being in the same vector space as the merchant vectors, the consumer vectors 106 reveal the consumer's interests in terms of their actual spending behavior. This information is by far a better base upon which to predict consumer spending at merchants than superficial demographic labels or categories. Thus, consumer C1's vector is very strongly aligned with the merchant vectors of merchants A, C, and E, indicating C1 is likely to be interested in the products and services of these merchants. C1's vector can be aligned with these merchants, even if C1 never purchased at any of them before. Thus, merchants A, C, and E have a clear means for identifying consumers who may be interested in purchasing from them.

[0040] Which consumers are associated with which merchant segments can also be determined by a membership function. This function can be based entirely on the merchant segment vectors and the consumer vectors (e.g. dot product), or on other quantifiable data, such as amount spent by a consumer in each merchant segment, or a predicted amount to be spent.
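The dot-product form of the membership function can be sketched as below, taking each segment vector to be the centroid of its merchants' vectors as described earlier. The segment names, the two-dimensional toy vectors, and the function names are hypothetical.

```python
def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def centroid(vectors):
    """Component-wise mean of a list of equal-length vectors."""
    dim = len(vectors[0])
    return [sum(v[i] for v in vectors) / len(vectors) for i in range(dim)]


def membership_values(consumer_vec, segments):
    """Dot product of the consumer vector with each segment's centroid vector."""
    return {name: dot(consumer_vec, centroid(vecs)) for name, vecs in segments.items()}


# Hypothetical segments, each listed as the vectors of its member merchants.
segments = {
    "upscale": [[1.0, 0.0], [0.8, 0.6]],
    "discount": [[0.0, 1.0], [-0.6, 0.8]],
}
values = membership_values([0.9, 0.436], segments)
best_segment = max(values, key=values.get)
print(best_segment, {k: round(v, 4) for k, v in values.items()})
```

The alternative membership functions mentioned in the text, such as amount spent or predicted spending per segment, would replace the dot product with those quantities.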

[0041] Given the consumers who are members of a segment, useful statistics can be generated for the segment, such as average amount spent, spending rate, ratios of how much these consumers spend in the segment compared with the population average, and so forth. This information enables merchants to finely target and promote their products to the appropriate consumers.
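A minimal sketch of such segment statistics follows. It computes the average in-segment spend of segment members, the average over the whole population, and their ratio. The dictionary layout, key names, and sample figures are assumptions made for illustration.

```python
def segment_statistics(spend_in_segment, member_ids):
    """Compare members' average in-segment spend with the population average.

    spend_in_segment maps every consumer in the population to the amount
    spent in the segment over the reporting period.
    """
    member_spend = [spend_in_segment[c] for c in member_ids]
    member_average = sum(member_spend) / len(member_spend)
    population_average = sum(spend_in_segment.values()) / len(spend_in_segment)
    ratio = member_average / population_average if population_average else float("inf")
    return {
        "member_average": member_average,
        "population_average": population_average,
        "ratio_to_population": ratio,
    }


# Hypothetical spend in one segment for a population of four consumers.
spend = {"c1": 300.0, "c2": 100.0, "c3": 0.0, "c4": 0.0}
stats = segment_statistics(spend, member_ids=["c1", "c2"])
print(stats)
```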

[0042] Fig. 2 illustrates portions of a sample index of merchant segments, as may be produced by the present invention. Segments are named by assigning each segment a unique segment number 200 between 1 and M, the total number of segments. In addition, each segment has a description field 210 which describes the merchant segment. A preferred description field is of the form:

Major Categories: Minor Categories: Demographics: Geography

[0043] Major categories 202 describe how the customers in a merchant segment typically use their accounts. Uses include retail purchases, direct marketing purchases, and where this type cannot be determined, then other major categories, such as travel uses, educational uses, services, and the like. Minor categories 204 describe either a subtype of the major category (e.g. subscriptions being a subtype of direct marketing) or the products or services (e.g. housewares, sporting goods, furniture) commonly purchased in the segment. Demographics information 206 uses account data from the consumers who frequent this segment to describe the most frequent or average demographic features, such as age range or gender, of the consumers. Geographic information 208 uses the account data to describe the most common geographic location of transactions in the segment. In each portion of the segment description 210 one or more descriptors may be used (i.e. multiple major, minor, demographic, or geographic descriptors). This naming convention is much more powerful and fine-grained than conventional SIC classifications, and provides insights into not just the industries of different merchants (as in SIC) but more importantly, into the geographic, approximate age or gender, and lifestyle choices of consumers in each segment.

[0044] The various types of segment reports are further described in section I, Reporting Engine, below. 

B. SYSTEM OVERVIEW

[0045] Turning now to Fig. 4a there is shown an illustration of a system architecture of one embodiment of the present invention during operation in a mode for predicting consumer spending. System 400 includes a data preprocessing module 402, a data postprocessing module 410, a profiling engine 412, and a reporting engine 426. Optional elements include a segment transition detection engine 420 and a targeting engine 422. System 400 operates on different types of data as inputs, including consumer summary file 404 and consumer transaction file 406, generates interim models and data, including the consumer profiles in profile database 414, merchant vectors 416, and merchant segment predictive models 418, and produces various useful outputs including various segment reports 428-432.
[0046] Fig. 4b illustrates system 400 during operation in a training mode, and here additionally includes predictive model generation system 440.

C. FUNCTIONAL OVERVIEW

[0047] Referring now to Fig. 3, there is shown a functional overview of the processes supported by the present invention. The process flow illustrated and described here is exemplary of how the present invention may be used, but does not limit the present invention to this exact process flow, as variants may be easily devised.

[0048] Generally then, master files 408 are created or updated 300 from account transaction data for a large col- 
lection of consumers (account holders) of a financial institution, as may be stored in the consumer summary files 404 
and the consumer transaction files 406. The master files 408 collect and organize the transactions of each consumer 
from different statement periods into a date ordered sequence of transaction data for each consumer. Processing of the 
master files 408 normalizes merchant names in the transaction data, and generates frequency statistics on the fre- 
quency of occurrence of merchant names. 
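The master file construction described above can be sketched as follows: transactions are grouped per consumer, ordered by date, and merchant name frequencies are tallied. The record layout and function names are hypothetical, and the name normalization shown (upper-casing and collapsing whitespace) is only a placeholder for the stemming and equivalencing the document refers to.

```python
from collections import Counter, defaultdict
from datetime import date


def normalize_name(raw):
    """Placeholder normalization: upper-case and collapse runs of whitespace."""
    return " ".join(raw.upper().split())


def build_master_file(transactions):
    """Group (consumer_id, date, merchant, amount) records into date-ordered sequences."""
    master = defaultdict(list)
    frequency = Counter()
    for consumer_id, when, merchant, amount in transactions:
        name = normalize_name(merchant)
        master[consumer_id].append((when, name, amount))
        frequency[name] += 1
    for records in master.values():
        records.sort(key=lambda record: record[0])  # order by transaction date
    return dict(master), frequency


# Hypothetical transactions drawn from different statement periods.
raw_transactions = [
    ("c1", date(2023, 3, 5), "acme  hardware", 20.0),
    ("c1", date(2023, 1, 9), "Acme Hardware", 35.0),
    ("c2", date(2023, 2, 1), "Blue Cafe", 8.5),
]
master, frequency = build_master_file(raw_transactions)
print(master["c1"][0], frequency["ACME HARDWARE"])
```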

[0049] In a training mode, the present invention creates or updates 302 merchant vectors associated with the merchant names. The merchant vectors are based on the co-occurrence of merchant names in defined co-occurrence windows (such as a number of transactions or period of time). Co-occurrence statistics are used to derive measures of how closely related any two merchants are based on their frequencies of co-occurrence with each other, and with other merchants. The relationship measures in turn influence the positioning of merchant vectors in the vector space so that merchants who frequently co-occur have vectors which are similarly oriented in the vector space, and the degree of similarity of the merchant vectors is a function of their co-occurrence rate.

[0050] The merchant vectors are then clustered 304 into merchant segments. The merchant segments generally
describe groups of merchants which are naturally (in the data) "shopped at together" based on the transactions of the
many consumers. Each merchant segment has a segment vector computed for it, which is a summary (e.g. centroid)
of the merchant vectors in the merchant segment. Merchant segments provide very rich information about the
merchants that are members of the segments, including statistics on rates and volumes of transactions, purchases, and
the like.
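
As a minimal sketch of the centroid summary mentioned above, the following computes a segment vector as the component-wise mean of its merchant vectors. The function name `segment_vector` and the three-dimensional sample vectors are illustrative assumptions, not part of the disclosure, and other summary functions could be used instead.

```python
def segment_vector(merchant_vectors):
    """Return the centroid (component-wise mean) of equal-length vectors."""
    if not merchant_vectors:
        raise ValueError("a segment needs at least one merchant vector")
    dim = len(merchant_vectors[0])
    count = len(merchant_vectors)
    return [sum(v[k] for v in merchant_vectors) / count for k in range(dim)]


# Hypothetical merchant vectors belonging to one segment.
segment = [[1.0, 0.0, 2.0], [3.0, 2.0, 0.0]]
centroid = segment_vector(segment)
```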

[0051] With the merchant segments now defined, a predictive model of spending behavior is created 306 for each
merchant segment. The predictive model for each segment is derived from observations of consumer transactions in
two time periods: an input time window and a subsequent prediction time window. Data from the input time
window for each consumer (including both segment specific and cross-segment) is used to extract independent
variables, and actual spending in the prediction window provides the dependent variable. The independent variables
describe the rate, frequency, and monetary amounts of spending in all segments and in the segment being
modeled. A consumer vector derived from the consumer's transactions may also be used. Validation and analysis of the
segment predictive models may be done to confirm the performance of the models.
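
The split into an input window and a prediction window can be sketched as follows. This is only an illustration under stated assumptions: the function `build_training_row`, the tuple layout of a transaction, the four feature names, and the sample dates are all hypothetical, and the actual independent variables may be considerably richer.

```python
from datetime import date


def build_training_row(transactions, segment, input_start, split, prediction_end):
    """Build (features, target) for one consumer and one merchant segment.

    transactions: list of (date, segment_name, amount) tuples (assumed layout).
    Input window: input_start <= date < split.
    Prediction window: split <= date < prediction_end.
    """
    in_window = [t for t in transactions if input_start <= t[0] < split]
    in_segment = [t for t in in_window if t[1] == segment]
    features = {
        "segment_count": len(in_segment),                   # frequency in the segment
        "segment_amount": sum(t[2] for t in in_segment),    # spending in the segment
        "total_count": len(in_window),                      # cross-segment frequency
        "total_amount": sum(t[2] for t in in_window),       # cross-segment spending
    }
    # Dependent variable: actual spending in the segment during the prediction window.
    target = sum(t[2] for t in transactions
                 if split <= t[0] < prediction_end and t[1] == segment)
    return features, target


# Hypothetical transactions for one consumer.
txns = [
    (date(1998, 1, 15), "travel", 200.0),
    (date(1998, 3, 2), "grocery", 50.0),
    (date(1998, 5, 20), "travel", 100.0),
    (date(1998, 8, 1), "travel", 75.0),
]
features, target = build_training_row(
    txns, "travel", date(1998, 1, 1), date(1998, 7, 1), date(1998, 10, 1))
```

Rows built this way for many consumers could then be passed to whatever regression technique is chosen for the segment model.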

[0052] In the production phase, the system is used to predict spending, either in future time periods for which there
is no actual data as of yet, or in a recent past time period for which data is available and which is used for retrospective
analysis. Generally, each account (or consumer) has a profile summarizing the transactional behavior of the account
holder. This information is created, or updated 308 with recent transaction data if present, to generate the appropriate
variables for input into the predictive models for the segments. (Model generation may also involve updating 308 of
account profiles.)

[0053] Each account further includes a consumer vector which is derived, e.g., as a summary vector of the merchant
vectors of the merchants at which the consumer has purchased in a defined time period, say the last [...].
Each merchant vector's contribution to the consumer vector can be weighted by the consumer's transactions at the
merchants, such as by transaction amounts, rates, or recency. The consumer vectors, in conjunction with the merchant
segment vectors, provide an initial level of predictive power. Each consumer can now be associated with the merchant
segment having a merchant segment vector closest to the consumer vector for the consumer.
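
A minimal sketch of this association step follows, assuming that the summary is a weighted sum of merchant vectors scaled to unit length and that "closest" means highest cosine similarity. Both choices, along with all names and sample values, are assumptions made for illustration.

```python
import math


def dot(u, v):
    return sum(a * b for a, b in zip(u, v))


def consumer_vector(weights, merchant_vectors):
    """Weighted sum of merchant vectors, scaled to unit length.

    weights: {merchant: weight}, e.g. transaction amount at that merchant.
    """
    dim = len(next(iter(merchant_vectors.values())))
    total = [0.0] * dim
    for merchant, weight in weights.items():
        for k, value in enumerate(merchant_vectors[merchant]):
            total[k] += weight * value
    norm = math.sqrt(dot(total, total))
    return [x / norm for x in total] if norm else total


def closest_segment(vector, segment_vectors):
    """Return the segment whose vector has the highest cosine similarity."""
    def cosine(name):
        seg = segment_vectors[name]
        return dot(vector, seg) / math.sqrt(dot(seg, seg) * dot(vector, vector))
    return max(segment_vectors, key=cosine)


# Hypothetical two-dimensional vectors and spending weights.
merchant_vectors = {"AIRLINE": [1.0, 0.0], "HOTEL": [0.8, 0.6], "GROCER": [0.0, 1.0]}
segment_vectors = {"travel": [0.9, 0.3], "household": [0.1, 0.9]}
cv = consumer_vector({"AIRLINE": 300.0, "HOTEL": 150.0, "GROCER": 40.0}, merchant_vectors)
best = closest_segment(cv, segment_vectors)
```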
[0054] Given the updated account profiles, this data is input into the set of predictive models to generate 310 for
each consumer an amount of predicted spending in each merchant segment in a desired prediction time period. For
example, the predictive models may be trained on a six month input window to predict spending in a subsequent [...]
month prediction window. The predicted period may be an actual future period or a current (e.g. recently ended) period.

[0055] The predicted spending levels and consumer profiles allow for various levels and types of account and
segment analysis 312. First, each account may be analyzed to determine which segment (or segments) the account is a
member of, based on various membership functions. A preferred membership function is the predicted spending in each
segment, so that each consumer is a member of the segment for which they have the highest predicted spending. Other
measures of association between accounts and segments may be based on percentile rankings of each consumer's
predicted spending across the various merchant segments. With any of these (or similar) methods of determining which
consumers are associated with which segments, an analysis of the rates and volumes of different types of transactions
by consumers in each segment can be generated. Further, targeting of accounts may be used
to selectively identify populations of consumers with predicted high dollar amounts or transaction rates. Account analysis
also identifies consumers who have transitioned between segments as indicated by increased or decreased membership.
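
The two membership measures described above can be sketched as follows. The function names, the consumer identifiers, and the predicted amounts are hypothetical, and the percentile definition used here (share of the population at or below a value) is one assumed convention among several.

```python
def highest_spend_segment(predicted):
    """predicted: {segment: predicted spending} for one consumer."""
    return max(predicted, key=predicted.get)


def percentile_rank(value, population):
    """Percent of the population whose predicted spending is at or below value."""
    return 100.0 * sum(1 for v in population if v <= value) / len(population)


# Hypothetical predicted spending per consumer and segment.
predicted = {
    "c1": {"travel": 500.0, "grocery": 120.0},
    "c2": {"travel": 50.0, "grocery": 300.0},
    "c3": {"travel": 90.0, "grocery": 80.0},
    "c4": {"travel": 700.0, "grocery": 60.0},
}
membership = {c: highest_spend_segment(p) for c, p in predicted.items()}
travel_population = [p["travel"] for p in predicted.values()]
c1_rank = percentile_rank(predicted["c1"]["travel"], travel_population)
```

Sorting consumers by predicted spending or percentile rank within a segment then gives a simple way to pick a target population for that segment.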

[0056] Using targeting criteria, promotions directed 314 to specific consumers in specific segments and the
merchants in those segments can be realized. For example, given a merchant segment, the consumers with the highest
levels (or rankings) of predicted spending in the segment may be identified, or the consumers having consumer vectors
closest to the segment vector may be selected. Or, the consumers who have [...]
in a segment may be selected. The merchants which make up the segment are known from the segment clustering 304.
One or more promotional offers specific to merchants in the segment can be created, such as discounts and
the like. The merchant-specific promotional offers are then directed to the selected consumers. Since these account
holders have been identified as having the greatest likelihood of spending in the segment, the promotional offers
beneficially coincide with their predicted spending behavior. This desirably results in an increased success rate for the
promotional offers.

[0057] Other uses and applications of the present invention will be apparent to those of skill in the art.

D. DATA PREPROCESSING MODULE

[0058] The data preprocessing module 402 (DPM) does initial processing of consumer data received from a source
of consumer accounts and transactions, such as a credit card issuer, in preparation for creating the merchant vectors,
consumer vectors, and merchant segment predictive models. DPM 402 is used in both production and training modes.
(In this disclosure, the terms "consumer," "customer," and "account holder" are used interchangeably.)
[0059] The inputs for the DPM are the consumer summary file 404 and the consumer transaction file 406. Generally,
the consumer summary file 404 provides account data on each consumer whose transaction data is to be processed,
such as account number and other account identifying and descriptive information. The consumer transaction file 406
provides details of each consumer's transactions. The DPM 402 processes these files to organize both sets of data by
account identifiers of the consumer accounts, and merges the data files so that each consumer's summary data is
available with their transactions.

[0060] Customer summary file 404: The customer summary file 404 contains one record for each customer that is
profiled by the system, and includes account information of the customer's account, and optionally includes
demographic information about the customer. The consumer summary file 404 is typically one that a financial institution,
such as a bank, credit card issuer, department store, and the like maintains on each consumer. The customer or the
financial institution may supply the additional demographic fields which are deemed to be of informational or of
predictive value. Examples of demographic fields include age, gender and income; other demographic fields may be
provided, as desired by the financial institution.

[0061] Table 1 describes one set of fields for the customer summary file 404 for a preferred embodiment. Most
fields are self-explanatory. The only required field is an account identifier that uniquely identifies each consumer
account and transactions. This account identifier may be the same as the consumer's account number; however, it is
preferable to have a different identifier used, since a consumer may have multiple account relationships with the
financial institution (e.g. multiple credit cards or bank accounts), and all transactions of the consumer should be dealt
with together. The account identifier is preferably derived from the account number, such as by a one-way hash or
encrypted value, such that each account identifier is uniquely associated with an account number. The pop_id field is
optionally used to segment the population of customers into arbitrary distinct populations as specified by the financial
institution, for example by payment history, account type, geographic region, etc.
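
One way to derive such an identifier is a keyed one-way hash, sketched below with Python's standard `hmac` and `hashlib` modules. The key, the sample account numbers, and the truncation to 24 characters (chosen only to fit the Char[max 24] format of Table 1) are assumptions; truncation makes collisions extremely unlikely rather than impossible, and this sketch does not by itself link multiple accounts of the same consumer.

```python
import hashlib
import hmac

# Hypothetical secret held by the financial institution.
SECRET_KEY = b"hypothetical-institution-key"


def account_identifier(account_number, key=SECRET_KEY):
    """Derive a 24-character identifier from an account number (one-way)."""
    digest = hmac.new(key, account_number.encode("ascii"), hashlib.sha256).hexdigest()
    return digest[:24]


# Made-up account numbers for illustration only.
id_a = account_identifier("0000111122223333")
id_a_again = account_identifier("0000111122223333")
id_b = account_identifier("0000111122223334")
```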



Table 1

Customer Summary File

Description                   Sample Format
----------------------------  --------------------
Account_id                    Char[max 24]
Pop_id                        Char('1'-'N')
Account_number                Char[max 16]
Credit bureau score           Short int as string
Internal credit risk score    Short int as string
Ytd_purchases                 Int as string
Ytd_cash_adv                  Int as string
Ytd_int_purchases             Int as string
Ytd_int_cash_adv              Int as string
State_code                    Char[max 2]
Zip_code                      Char[max 5]
Demographic_1                 Int as string
...
Demographic_N                 Int as string



[0062] Note the additional, optional demographic fields for containing demographic information about each
consumer. In addition to demographic information, various summary statistics of the consumer's account may be included.
These include any of the following:
Table 2

Example Demographic Fields for Customer Summary File

Description                       Explanation
--------------------------------  ----------------------------------------------------------------------------
[...]                             Equivalent to number of plastics
Credit line
Open to buy
Initial month statement balance   Balance on the account prior to the first month of transaction data pull
Last month statement balance      Balance on the account at the end of the transaction data pulled
Monthly payment amount            For each month of transaction data contributed or the average over last year.
Monthly cash advance amount       For each month of transaction data contributed or the average over last year.
Monthly cash advance count        For each month of transaction data contributed or the average over last year.
Monthly purchase amount           For each month of transaction data contributed or the average over last year.
Monthly purchase count            For each month of transaction data contributed or the average over last year.
Monthly cash advance interest     For each month of transaction data contributed or the average over last year.
Monthly purchase interest         For each month of transaction data contributed or the average over last year.
Monthly late charge               For each month of transaction data contributed or the average over last year.


[0063] Consumer transaction file 406: The consumer transaction file 406 contains transaction level data for the
consumers in the consumer summary file. The shared key is the account_id. In a preferred embodiment, the transaction
file has the following description.



Table 3

Consumer Transaction File

Description           Sample Format
--------------------  --------------------------
Account_id            Quoted char(24) - [0-9]
Account_number        Quoted char(16) - [0-9]
Pop_id                Quoted char(1) - [0-128]
Transaction_code      Integer
Transaction_amount    Float
Transaction_time      HH:MM:SS
Transaction_date      YYYYMMDD
Transaction_type      Char(5)
SIC_code              Char(5) - [0-9]
Merchant_descriptor   Char(25)
SKU Number            Variable length list
Merchant zip code     Char[max 5]

[0064] The SKU and merchant zip code data are optional, and may be used for more fine-grained filtering of which 
transactions are considered as co-occurring. 

[0065] The output for the DPM is the collection of master files 408 containing a merged file of the account
information and transaction information for each consumer. The master file is generated as a preprocessing step before
inputting data to the profiling engine 412. The master file 408 is essentially the customer summary file 404 with the
consumer's transactions appended to the end of each consumer's account record. Hence the master file has variable
length records. The master files 408 are preferably stored in a database format allowing for SQL querying. There is one
record per account identifier.

[0066] In a preferred embodiment, the master files 408 have the following information:

Table 4

Master File 408

Description                  Sample Format
---------------------------  --------------------
Account_id                   Char[max 24]
Pop_id                       Char('1'-'N')
Account_number               Char[max 16]
Credit bureau score          Short int as string
Ytd_purchases                Int as string
Ytd_cash_advances            Int as string
Ytd_interest_on_purchases    Int as string
Ytd_interest_on_cash_advs    Int as string
State_code                   Char[max 2]
Demographic_1                Int as string
...
Demographic_N                Int as string
(transactions)

[0067] The transactions included for each consumer include the various data fields described above, and any other
per-transaction optional data that the financial institution desires to track.

[0068] The master file 408 preferably includes a header that indicates last update and number of updates. The
master file may be incrementally updated with new customers and new transactions for existing customers. The master
file database is preferably updated on a monthly basis to capture new transactions by the financial institution's
consumers.

[0069] The DPM 402 creates the master file 408 from the consumer summary file 404 and consumer transaction 
file 406 by the following process: 

a) Verify minimum data requirements. The DPM 402 determines the number of data files it is handling (since there
may be many physical media sources), and the length of the files to determine the number of accounts and
transactions. Preferably, a minimum of 12 months of transactions for a minimum of 2 million accounts is used to provide
fully robust models of merchants and segments. However, there is no formal lower bound to the amount of data on
which system 400 may operate.

b) Data cleaning. The DPM 402 verifies valid data fields, and discards invalid records. Invalid records are records
that are missing any of the required fields for the customer summary file or the transaction file. The DPM 402
also indicates missing values for fields that have corrupt or missing data and are optional. Duplicate transactions
are eliminated using account ID, account number, transaction code, transaction amount, date, and merchant
description as a key.

c) Sort and merge files. The consumer summary file 404 and the consumer transaction file 406 are both sorted by
account ID; the consumer transaction file 406 is further sorted by transaction date. Additional sorting of the
transaction file, for example on time, type of transaction, merchant zip code, may be applied to further influence the
determination of merchant co-occurrence. The sorted files are merged into the master file 408, with one record per
account as described above.
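
Steps b) and c) can be sketched in a few lines. The dictionary field names, the choice of required fields, and the in-memory data layout below are assumptions for illustration; a real implementation would stream sorted files rather than hold 2 million accounts in memory.

```python
# Assumed set of required transaction fields for this sketch.
REQUIRED = ("account_id", "date", "amount", "merchant")


def build_master(summaries, transactions):
    """Clean, de-duplicate, sort, and merge into one record per account.

    summaries: {account_id: dict of summary fields}
    transactions: list of dicts; dates are YYYYMMDD strings as in Table 3.
    """
    seen, kept = set(), []
    for t in transactions:
        if any(t.get(field) in (None, "") for field in REQUIRED):
            continue  # discard records missing a required field
        key = (t["account_id"], t.get("account_number"), t.get("code"),
               t["amount"], t["date"], t["merchant"])
        if key in seen:
            continue  # eliminate duplicate transactions
        seen.add(key)
        kept.append(t)
    # Sort by account ID, then by transaction date.
    kept.sort(key=lambda t: (t["account_id"], t["date"]))
    master = {acct: dict(info, transactions=[])
              for acct, info in sorted(summaries.items())}
    for t in kept:
        if t["account_id"] in master:
            master[t["account_id"]]["transactions"].append(t)
    return master


# Hypothetical input data.
summaries = {"A1": {"state_code": "CA"}, "A2": {"state_code": "NY"}}
transactions = [
    {"account_id": "A1", "date": "19980302", "amount": 20.0, "merchant": "wal-mart", "code": 1},
    {"account_id": "A1", "date": "19980115", "amount": 35.5, "merchant": "u-haul", "code": 1},
    {"account_id": "A1", "date": "19980302", "amount": 20.0, "merchant": "wal-mart", "code": 1},  # duplicate
    {"account_id": "A2", "date": "19980210", "amount": 9.0, "merchant": "", "code": 1},  # invalid
]
master = build_master(summaries, transactions)
```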



[0070] Due to the large volume of data involved in this stage, compression of the master files 408 is preferred,
where on-the-fly compression and decompression is supported. This often improves system performance due to
decreased I/O. In addition, as illustrated in Fig. 4a, the master file 408 may be split into multiple subfiles, such as
splitting by population ID, or other variable, again to reduce the amount of data being handled at any one time.
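
As an illustration of on-the-fly compression, the sketch below writes and reads master records through Python's standard `gzip` module, which compresses and decompresses as the stream is written or read. The JSON-lines record format and the file name are assumptions chosen for brevity, not the storage format of the disclosure.

```python
import gzip
import json
import os
import tempfile


def write_master(path, records):
    """Write one JSON record per line, compressing as the data is written."""
    with gzip.open(path, "wt", encoding="utf-8") as handle:
        for record in records:
            handle.write(json.dumps(record) + "\n")


def read_master(path):
    """Yield records one at a time, decompressing as the data is read."""
    with gzip.open(path, "rt", encoding="utf-8") as handle:
        for line in handle:
            yield json.loads(line)


# Hypothetical records for one population subfile.
records = [{"account_id": "A1", "pop_id": "1"}, {"account_id": "A2", "pop_id": "1"}]
path = os.path.join(tempfile.mkdtemp(), "master_pop_1.jsonl.gz")
write_master(path, records)
restored = list(read_master(path))
```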

E. PREDICTIVE MODEL GENERATION SYSTEM

[0071] Referring to Fig. 4b, the predictive model generation system 440 takes as its inputs the master file 408 and
creates the consumer profiles and consumer vectors, the merchant vectors and merchant segments, and the segment
predictive models. This data is used by the profiling engine to generate predictions of future spending by a consumer
in each merchant segment using inputs from the data postprocessing module 410.

[0072] Fig. 5 illustrates one embodiment of the predictive model generation system 440 that includes three
modules: a merchant vector generation module 510, a clustering module 520, and a predictive model generation module
530.



1. Merchant Vector Generation

[0073] Merchant vector generation is the application of a context vector type analysis to the account data of the
consumers, and more particularly to the master files 408. The operations for merchant vector generation are managed by
the merchant vector generation module 510.

[0074] In order to obtain the initial merchant vectors, additional processing of the master files 408 precedes the
analysis of which merchants co-occur in the master files 408. There are two sequential processes that are used on the
merchant descriptions: stemming and equivalencing. These operations normalize variations of an individual merchant's
name to a single common merchant name to allow for consistent identification of transactions at the merchant. This
processing is managed by the vector generation module 510.

[0075] Stemming is the process of removing extraneous characters from the merchant descriptions. Examples of
extraneous characters include punctuation and trailing numbers. Trailing numbers are removed because they usually
indicate the particular store in a large chain (e.g. Wal-Mart #12345). It is preferable to identify all the outlets of a
particular chain of stores as a single merchant description. Stemming optionally converts all letters to lower case, and
replaces all space characters with a dash. This causes all merchant descriptions to be an unbroken string of non-space
characters. The lower case constraint has the advantage of making it easy to distinguish non-stemmed merchant
descriptions from stemmed descriptions.
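
A minimal stemming sketch, assuming that punctuation other than dashes is dropped and that only purely numeric trailing words are treated as store numbers, might look as follows. The function name `stem` and the regular expressions are illustrative choices; note that this sketch applies the optional lower-casing.

```python
import re


def stem(description):
    """Strip punctuation and trailing store numbers; join words with dashes."""
    text = description.lower()
    text = re.sub(r"[^a-z0-9\s-]", " ", text)    # punctuation becomes a space
    words = re.split(r"[\s-]+", text.strip())    # split on spaces and dashes
    while words and words[-1].isdigit():         # drop trailing numbers
        words.pop()
    return "-".join(w for w in words if w)


stem_chain = stem("Wal-Mart #12345")
stem_service = stem("Roto Rooter Sewer Serv.")
```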

[0076] Equivalencing is applied after stemming, and identifies various different spellings of a particular merchant's
description as being associated with a single merchant description. For example, the "Roto-Rooter" company may
occur in the transaction data with the following three stemmed merchant descriptions: "ROTO-ROOTER-SEWER-SERV",
"ROTO-ROOTER-SERVICE", and "ROTO-ROOTER-SEWER-DR". An equivalence table is set up containing a
root name and a list of all equivalent names. In this example, ROTO-ROOTER-SEWER-SERV becomes the root name,
and the latter two of these descriptions are listed as equivalents. During operation, such as generation of subsequent
master files 408 (e.g. the next monthly update), an identified equivalenced name is replaced with its root name from the
equivalence table.
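
The equivalence table and its use during later updates can be sketched as a root-to-equivalents mapping that is inverted for lookup. The variable and function names are assumptions; the merchant descriptions are those of the example above.

```python
# Root name mapped to its list of equivalent names.
equivalence_table = {
    "ROTO-ROOTER-SEWER-SERV": ["ROTO-ROOTER-SERVICE", "ROTO-ROOTER-SEWER-DR"],
}

# Inverted index for fast replacement: equivalent name -> root name.
root_of = {alias: root
           for root, aliases in equivalence_table.items()
           for alias in aliases}


def to_root(description):
    """Replace an equivalenced name with its root; leave other names as they are."""
    return root_of.get(description, description)


replaced = to_root("ROTO-ROOTER-SEWER-DR")
unchanged = to_root("U-HAUL")
```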

[0077] In one embodiment, equivalencing proceeds in two steps, with an optional third step. The first equivalencing
step uses a fuzzy trigram matching algorithm that attempts to find merchant descriptions with nearly identical spellings.
This method collects statistics on all the trigrams (sets of three consecutive letters in a word) in all the merchant
descriptions, and maintains a list of the trigrams in each merchant description. The method then determines a closeness
score for any two merchant names that are supplied for comparison, based on the number of trigrams the merchant
names have in common. If the two merchant names are scored as being sufficiently close, they are equivalenced.
Appendix I, below, provides a novel trigram matching algorithm useful for equivalencing merchant names (and other
strings). This algorithm uses a vector representation of each trigram, based on trigram frequency in the data set, to
construct trigram vectors, and judges closeness based on vector dot products.
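
The simpler idea of scoring two names by the trigrams they share can be sketched with a Dice coefficient over trigram sets. This is not the frequency-weighted vector algorithm of Appendix I; the scoring formula, the threshold value, and the function names are assumptions used only to make the idea concrete.

```python
def trigrams(name):
    """Set of three-consecutive-character substrings of each dash-separated word."""
    grams = set()
    for word in name.lower().split("-"):
        grams.update(word[i:i + 3] for i in range(len(word) - 2))
    return grams


def closeness(a, b):
    """Dice coefficient on trigram sets: 1.0 if identical, 0.0 if disjoint."""
    ta, tb = trigrams(a), trigrams(b)
    if not ta or not tb:
        return 0.0
    return 2.0 * len(ta & tb) / (len(ta) + len(tb))


THRESHOLD = 0.6  # hypothetical cut-off for "sufficiently close"

score_similar = closeness("ROTO-ROOTER-SEWER-SERV", "ROTO-ROOTER-SEWER-SERVICE")
score_unrelated = closeness("ROTO-ROOTER-SEWER-SERV", "U-HAUL-ATLANTA")
```

Pairs scoring above the threshold would be candidates for equivalencing; weighting rare trigrams more heavily than common ones is the refinement that a frequency-based vector representation adds.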
[0078] Preferably, equivalencing is applied only to merchants that are assigned the same SIC code. This constraint
is useful since two merchants may have a similar name, but if they are in different SIC classifications there is a good
chance that they are, in fact, different businesses.

[0079] The second equivalencing step consists of fixing a group of special cases. These special cases are
identified as experience is gained with the particular set of transaction data being processed. There are two broad classes
that cover most of these special cases: a place name is used instead of a number to identify specific outlets in a chain
of stores, and some department stores append the name of the specific department to the name of the chain. An
example of the first case is U-Haul, where stemmed descriptions look like U-HAUL-SAN-DIEGO, U-HAUL-ATLANTA, and the
like. An example of the second case is Robinsons-May department stores, with stemmed descriptions like
ROBINSONMAY-LEE-WOMEN, ROBINSONMAY-LEVI-SHORT, ROBINSONMAY-TRIFARI-CO, and ROBINSONMAY-JANE-ASHLE.
In both cases, any merchant description in the correct SIC codes that contains the root name (e.g. U-HAUL or
ROBINSONMAY) is equivalenced to the root name.
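
The special-case rule, root-name containment restricted to particular SIC codes, can be sketched as below. The table `SPECIAL_ROOTS` and the SIC code values in it are illustrative placeholders rather than values taken from the disclosure.

```python
# Root name -> SIC codes in which the rule applies (codes are placeholders).
SPECIAL_ROOTS = {
    "U-HAUL": {"7513"},
    "ROBINSONMAY": {"5311"},
}


def apply_special_cases(description, sic_code, roots=SPECIAL_ROOTS):
    """Return the root name if it occurs in the description under a matching SIC code."""
    for root, sic_codes in roots.items():
        if sic_code in sic_codes and root in description:
            return root
    return description


outlet = apply_special_cases("U-HAUL-SAN-DIEGO", "7513")
department = apply_special_cases("ROBINSONMAY-LEVI-SHORT", "5311")
other_sic = apply_special_cases("U-HAUL-ATLANTA", "5311")
```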

[0080] A third, optional step includes a manual inspection and correction of the descriptions for the highest
frequency merchants. The number of merchants subjected to this inspection varies, depending upon the time constraints
in the processing stream. This step catches the cases that are not amenable to the two previous steps. An example is
Microsoft Network, with merchant descriptions like MICROSOFT-NET and MSN-BILLING. With enough examples from
the transaction data, these merchant descriptors can also be added to the special cases in step two, above.

[0081] Preferably, at least one set of master files 408 is generated before the equivalencing is determined. This is
desirable in order to compile statistics on frequencies of each merchant description within each SIC code before the
equivalencing is started.

[0082] Once the equivalence table is constructed, the original master files 408 are rebuilt using the equivalenced
merchant descriptions. This step replaces all equivalenced merchant descriptors with their associated root names,
thereby ensuring that all transactions for the merchant are associated with the same merchant descriptor. Subsequent
incoming transaction data can be equivalenced before it is added to the master files, using the original equivalence
table.

[0083] Given the equivalence table, a merchant descriptor frequency list can be determined describing the fre- 
quency of occurrence of each merchant descriptor (including its equivalents). 

[0084] Once the equivalence table is defined, an initial merchant vector is assigned to each root name. The
merchant vector training based on co-occurrence is then performed, processing the master files by account ID and then by
date as described above.

2. Training of Merchant Vectors: The UDL Algorithm 

[0085] As noted above, the merchant vectors are based on the co-occurrence of merchants in each consumer's
transaction data. The master files 408, which are ordered by account and within account by transaction date, are
processed by account and then in date order to identify groups of co-occurring merchants. The co-occurrence of merchant
names (once equivalenced) is the basis of updating the values of the merchant vectors.

[0086] The training of merchant vectors is based upon the unexpected deviation of co-occurrences of merchants in
transactions. More particularly, an expected rate at which any pair of merchants co-occur in the transaction data is
estimated based upon the frequency with which each individual merchant appears in co-occurrence with any other
merchants, and a total number of co-occurrence events. The actual number of co-occurrences of a pair of merchants is
determined. If a pair of merchants co-occur more frequently than expected, then the merchants are positively related,
and the strength of that relationship is a function of the "unexpected" amount of co-occurrence. If the pair of merchants
co-occur less frequently than expected, then the merchants are negatively related. If a pair of merchants co-occur in
the data about the same as expected, then there is generally no relationship between them. Using the relationship
strengths of each pair of merchants as the desired dot product between the merchant vectors, the values of the
merchant vectors can be determined in the vector space. This process is the basis of the Unexpected Deviation Learning
algorithm, or "UDL".
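
To make the comparison of observed and expected co-occurrence concrete, the sketch below assumes an independence model in which the expected count for a pair is the product of the two merchants' co-occurrence totals divided by the grand total, and it uses a standardized residual as a stand-in relationship strength. Both formulas are assumptions for illustration and are not the specific measures of the UDL algorithm.

```python
import math


def relationship_strengths(counts):
    """counts: {(a, b): observed co-occurrences}, each unordered pair listed once."""
    totals = {}
    for (a, b), n in counts.items():
        totals[a] = totals.get(a, 0) + n
        totals[b] = totals.get(b, 0) + n
    grand_total = sum(totals.values())  # each event counted once per participant
    strengths = {}
    names = sorted(totals)
    for i, a in enumerate(names):
        for b in names[i + 1:]:
            observed = counts.get((a, b), 0) + counts.get((b, a), 0)
            expected = totals[a] * totals[b] / grand_total
            if expected == 0:
                continue
            # Positive if the pair co-occurs more than expected, negative if less.
            strengths[(a, b)] = (observed - expected) / math.sqrt(expected)
    return strengths


# Hypothetical co-occurrence counts among four merchants.
counts = {
    ("M1", "M2"): 8, ("M3", "M4"): 8,
    ("M1", "M3"): 1, ("M1", "M4"): 1,
    ("M2", "M3"): 1, ("M2", "M4"): 1,
}
strengths = relationship_strengths(counts)
```

In this made-up data every merchant has the same total, so the expected count is identical for every pair; the pairs observed eight times come out positively related and the pairs observed once come out negatively related.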

[0087] This approach overcomes the problems associated with conventional vector based models of
representation, which tend to be based on overall frequencies of terms relative to the database as a whole. Specifically, in a
conventional model, the high frequency merchants, that is, merchants for which there are many, many purchases, would
co-occur with many other merchants, and either falsely suggest that these other merchants are related to the high
frequency merchants, or simply be so heavily down-weighted as to have very little influence at all. That is, high frequency
merchant names would be treated like high frequency English words such as "the" and "and", which are given
very low weights in conventional vector systems specifically because of their high frequency.

[0088] However, the present invention takes account of the high frequency presence of individual merchants, and
instead analyzes the expected rate at which merchants, including high frequency merchants, co-occur with other
merchants. High frequency merchants are expected to co-occur more frequently. If a high frequency merchant and another
merchant co-occur even more frequently than expected, then there is a positive correlation between them. The present
invention thus accounts for the high frequency merchants in a manner that conventional methodologies cannot.
[0089] The overall process of modeling the merchant vectors using unexpected deviation is as follows:

1. First, count the number of times that the merchants co-occur with one another in the transaction data. The
intuition is that related merchants occur together often, whereas unrelated merchants do not occur together often.

2. Next, calculate the relationship strength between merchants based on how much the observed co-occurrence
deviated from the expected co-occurrence. The relationship strength has the following characteristics:

• Two merchants that co-occur significantly more often than expected are positively related to one another.

• Two merchants that co-occur significantly less often than expected are negatively related to one another.

• Two merchants that co-occur about the number of times expected are not related.

3. Map the relationship strength onto vector space; that is, determine the desired dot product between the
merchant vectors for all pairs of items given their relationship strength. The mapping results in the following
characteristics:

• The merchant vectors for positively related merchants have a positive dot product.

• The merchant vectors for negatively related merchants have a negative dot product.

• The merchant vectors for unrelated merchants have a zero dot product.

4. Update the merchant vectors from their initial assignments, so that the dot products between them at least
closely approximate the desired dot products.

25 [0090] The next sections explain this process In further detail. 

a) Co-occurrence Counting

[0091] Co-occurrence counting is the procedure of counting the number of times that two items, here merchant
descriptions, co-occur within a fixed size co-occurrence window in some set of data, here the transactions of the
consumers. Counting can be done forwards, backwards, or bi-directionally. The best way to illustrate co-occurrence
counting is to give an example for each type of co-occurrence count.
[0092] Example: Consider the sequence of merchant names:

M1 M3 M1 M3 M3 M2 M3

where M1, M2 and M3 stand for arbitrary merchant names as they might appear in a sequence of transactions by a
consumer. For the purposes of this example, intervening data, such as dates of transactions, amounts, transaction
identifiers, and the like, are ignored. Further assume a co-occurrence window with a size = 3. Here, the co-occurrence
window is based on a simple count of items or transactions, and thus the co-occurrence window represents a group of
three transactions in sequence.

i) Forward co-occurrence counting

[0093] The first step in the counting process is to set up the forward co-occurrence windows. Fig. 6a illustrates the
co-occurrence windows 602 for forward co-occurrence counting of this sequence of merchant names. By definition,
each merchant name is a target 604, indicated by an arrow, for one and only one co-occurrence window 602. Therefore,
in this example there are seven forward co-occurrence windows 602, labeled 1 through 7. The other merchant names
within a given co-occurrence window 602 are called the neighbors 606. In forward co-occurrence counting, the
neighbors occur after the target. For window size = 3 there can be at most three neighbors 606 within a given
co-occurrence window 602. Obviously, the larger the window size, the more merchants (and transactions) are deemed to
co-occur at a time.

[0094] The next step is to build a table containing all co-occurrence events. A co-occurrence event is simply a
pairing of a target 604 with a neighbor 606. For the co-occurrence window #1 in Fig. 6a, the target is M1 and the
neighbors are M3, M1, and M3. Therefore, the co-occurrence events in this window are: (M1, M3), (M1, M1), and
(M1, M3). Table 5 contains the complete listing of co-occurrence events for every co-occurrence window in this example:
Table 5: Forward co-occurrence event table

Co-occurrence Window    Target    Neighbor
1                       M1        M3
1                       M1        M1
1                       M1        M3
2                       M3        M1
2                       M3        M3
2                       M3        M3
3                       M1        M3
3                       M1        M3
3                       M1        M2
4                       M3        M3
4                       M3        M2
4                       M3        M3
5                       M3        M2
5                       M3        M3
6                       M2        M3



[0095] The last step is to tabulate the number of times that each unique co-occurrence event occurred. A unique
co-occurrence event is the combination (in any order) of two merchant names. Table 6 shows this tabulation in matrix
form. The rows indicate the targets and the columns indicate the neighbors. For future reference, this matrix will be
called the forward co-occurrence matrix.

Table 6: Forward co-occurrence matrix

                       Neighbor
                  M1    M2    M3    Total
Target   M1        1     1     4      6
         M2        0     0     1      1
         M3        1     2     5      8
         Total     2     3    10     15
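The forward counting procedure can be illustrated with a minimal Python sketch. The function name `forward_cooccurrence` and the use of a `Counter` keyed by (target, neighbor) pairs are illustrative assumptions, not part of the described system; the sketch simply reproduces the counts of Table 6 for the example sequence with a window size of 3.

```python
from collections import Counter

def forward_cooccurrence(sequence, window=3):
    """Count (target, neighbor) events where the neighbors follow the target.

    Each item is the target of exactly one window; its neighbors are the
    next `window` items in the sequence (fewer near the end).
    """
    counts = Counter()
    for i, target in enumerate(sequence):
        for neighbor in sequence[i + 1 : i + 1 + window]:
            counts[(target, neighbor)] += 1
    return counts

# Example sequence of merchant names from the text.
sequence = ["M1", "M3", "M1", "M3", "M3", "M2", "M3"]
forward = forward_cooccurrence(sequence)

for target in ("M1", "M2", "M3"):
    print(target, [forward[(target, n)] for n in ("M1", "M2", "M3")])
```

Swapping the slice for the items that precede the target would give the backward counts in the same way.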



ii) Backward co-occurrence counting

[0096] Backward co-occurrence counting is done in the same manner as forward co-occurrence counting except
that the neighbors precede the target in the co-occurrence windows. Fig. 6b illustrates the co-occurrence windows for
the same sequence of merchant names for backward co-occurrence counting.

[0097] Once the co-occurrence windows are specified, the co-occurrence events can be identified and counted.
Table 7: Backward co-occurrence event table

Co-occurrence Window    Target    Neighbor
1                       M3        M2
1                       M3        M3
1                       M3        M3
2                       M2        M3
2                       M2        M3
2                       M2        M1
3                       M3        M3
3                       M3        M1
3                       M3        M3
4                       M3        M1
4                       M3        M3
4                       M3        M1
5                       M1        M3
5                       M1        M1
6                       M3        M1



[0098] The number of times that each unique co-occurrence event occurred is then recorded in the backward
co-occurrence matrix.



Table 8: Backward co-occurrence matrix

                       Neighbor
                  M1    M2    M3    Total
Target   M1        1     0     1      2
         M2        1     0     2      3
         M3        4     1     5     10
         Total     6     1     8     15



[0099] Note that the forward co-occurrence matrix and the backward co-occurrence matrix are the transpose of one
another. Thus, only one of the two counts need be performed, and then the transpose of the resulting co-occurrence
matrix taken to obtain the other.



iii) Bi-directional co-occurrence counting

[0100] The bi-directional co-occurrence matrix is just the sum of the forward co-occurrence matrix and the backward
co-occurrence matrix. The resulting matrix will always be symmetric. In other words, the co-occurrence between
merchant names A and B is the same as the co-occurrence between merchant names B and A. This property is
desirable because this same symmetry is inherent in vector space; that is, for merchant vectors $V_A$ and $V_B$ for
merchants A and B,

$V_A \cdot V_B = V_B \cdot V_A$

For this reason, the preferred embodiment uses the bi-directional co-occurrence matrix.

Table 9: Bi-directional co-occurrence matrix

                       Neighbor
                  M1    M2    M3    Total
Target   M1        2     1     5      8
         M2        1     0     3      4
         M3        5     3    10     18
         Total     8     4    18     30
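As a check on the relationship between the three matrices, the following Python sketch starts from the forward matrix of Table 6, takes its transpose as the backward matrix, and sums the two to obtain the bi-directional matrix. The nested-list representation and the helper name `transpose` are assumptions made for illustration only.

```python
# Forward co-occurrence matrix from Table 6 (rows: target, columns: neighbor),
# in merchant order M1, M2, M3.
forward = [
    [1, 1, 4],
    [0, 0, 1],
    [1, 2, 5],
]

def transpose(matrix):
    """Return the transpose of a square nested-list matrix."""
    return [list(column) for column in zip(*matrix)]

# Backward counting yields the transpose of the forward matrix.
backward = transpose(forward)

# The bi-directional matrix is the element-wise sum of the two.
bidirectional = [
    [f + b for f, b in zip(f_row, b_row)]
    for f_row, b_row in zip(forward, backward)
]

row_totals = [sum(row) for row in bidirectional]
print(bidirectional)   # symmetric by construction
print(row_totals, sum(row_totals))
```

Because the sum of a matrix and its transpose is always symmetric, the symmetry does not depend on the particular transaction sequence.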



[0101] Figs. 7a and 7b illustrate the above concepts in the context of consumer transaction data in the master files
408. In Fig. 7a there is shown a portion of the master file 408 containing transactions of a particular customer. This
data is prior to the stemming and equivalencing steps described above, and so includes the original names of the
merchants with spaces, store numbers and locations and other extraneous data.
[0102] Fig. 7b illustrates the same data after stemming and equivalencing. Notice that the two transactions at
STAPLES which previously identified a store number are now equivalenced. The two car rental transactions at ALAMO
which previously included the location are equivalenced to ALAMO, as are two hotel stays at HILTON
which also previously included the hotel location. Further note that the HILTON transactions specified the location prior
to the hotel name. Finally, the two transactions at NORDSTROMS which previously identified a department have been
equivalenced to the store name itself.

[0103] Further, a single forward co-occurrence window 700 is shown with the target 702 being the first transaction
at the HILTON, and the next three transactions being neighbors 704.

[0104] Accordingly, following the updating of the master files 408 with the stemmed and equivalenced names, the
merchant vector generation module 610 performs the following steps for each consumer account:

1. Read the transaction data in date order.

2. Forward count the co-occurrences of merchant names in the transaction data, using a predetermined
co-occurrence window.

3. Generate the forward co-occurrence, backward co-occurrence and bi-directional co-occurrence matrices.

[0105] One preferred embodiment uses a co-occurrence window size of three transactions. This captures the
transactions as the co-occurring events (and not the presence of merchant names within three words of each other) based
only on sequence. In an alternate embodiment the co-occurrence window is time-based, using a date range in order to
identify co-occurring events. For example, with a co-occurrence window of 1 week, given a target transaction, a
co-occurring neighbor transaction occurs within one week of the target transaction. Yet another date approach is to define
the target not as a transaction, but rather as a target time period, and then the co-occurrence window as another time
period. For example, the target period can be a three month block, so that all transactions within the block are the
targets, and the co-occurrence window may then be all transactions in the two months following the target period. Thus,
each merchant having a transaction in the target period co-occurs with each merchant (same or other) having a
transaction in the co-occurrence period. Those of skill in the art can readily devise alternate co-occurrence definitions
which capture the sequence and/or time related principles of co-occurrence in accordance with the present invention.
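The time-based alternative can be sketched in the same style as sequence-based counting. The following Python fragment is a minimal sketch under stated assumptions: transactions are (date, merchant) tuples already sorted by date, counting is forward only, and the one-week window is inclusive. The function name `time_window_cooccurrence` and the sample dates are hypothetical.

```python
from collections import Counter
from datetime import date, timedelta

def time_window_cooccurrence(transactions, window=timedelta(days=7)):
    """Forward-count merchant pairs whose transactions fall within `window`.

    `transactions` is a list of (date, merchant) tuples sorted by date.
    """
    counts = Counter()
    for i, (target_date, target) in enumerate(transactions):
        for neighbor_date, neighbor in transactions[i + 1:]:
            if neighbor_date - target_date > window:
                break  # input is sorted, so later transactions are further away
            counts[(target, neighbor)] += 1
    return counts

# Illustrative data only; the dates are made up.
transactions = [
    (date(1999, 3, 1), "HILTON"),
    (date(1999, 3, 2), "ALAMO"),
    (date(1999, 3, 6), "STAPLES"),
    (date(1999, 3, 20), "NORDSTROMS"),
]
counts = time_window_cooccurrence(transactions)
print(dict(counts))
```

Unlike the fixed-size window, the number of neighbors per target here varies with how densely the consumer transacts.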
b) Estimating Expected Co-occurrence Counts

[0106] In order to determine whether two merchants are related, the UDL algorithm uses an estimate of the number
of times that the merchants would be expected to co-occur. Suppose the only information known about the transaction
data is the number of co-occurrence events in which each merchant appeared.

[0107] Now suppose that it is desired to predict the number of times two arbitrary merchants, merchant_i and
merchant_j, co-occur. In the absence of any additional information we would have to assume that merchant_i and
merchant_j are not correlated. In terms of probability theory, this means that the occurrence of a transaction at
merchant_i will not affect the probability of the occurrence of a transaction at merchant_j:

$P_{j|i} = P_j$   [1]

[0108] The joint probability of merchant_i and merchant_j is given by

$P_{ij} = P_i P_{j|i}$   [2]

[0109] Substituting $P_j$ for $P_{j|i}$ into equation [2] gives

$P_{ij} = P_i P_j$   [3]

[0110] However, the true probabilities $P_i$ and $P_j$ are unknown, and so they must be estimated from the limited
information given about the data. In this scenario, the maximum likelihood estimates $\hat{P}_i$ and $\hat{P}_j$ for
$P_i$ and $P_j$ are

$\hat{P}_i = \frac{T_i}{T}$   [4]

$\hat{P}_j = \frac{T_j}{T}$   [5]

where

$T_i$ is the number of co-occurrence events that merchant_i appeared in,
$T_j$ is the number of co-occurrence events that merchant_j appeared in, and
$T$ is the total number of co-occurrence events.

[0111] These data values are taken from the bi-directional co-occurrence matrix. Substituting these estimates into
equation [3] produces

$\hat{P}_{ij} = \frac{T_i T_j}{T^2}$   [6]

which is the estimate of the joint probability.

[0112] Since there are a total of $T$ independent co-occurrence events in the transaction data, the expected number
of co-occurring transactions of merchant_i and merchant_j is

$\hat{T}_{ij} = T \hat{P}_{ij} = \frac{T_i T_j}{T}$   [7]



[0113] This expected value serves as a reference point for determining the correlation between any two merchants
in the transaction data. If two merchants co-occur significantly more often than expected by $\hat{T}_{ij}$, the two
merchants are positively related. Similarly, if two merchants co-occur significantly less often than expected, the two
merchants are negatively related. Otherwise, the two merchants are practically unrelated.
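To make the estimate concrete, the following Python sketch computes the expected count $T_i T_j / T$ for every merchant pair from the bi-directional matrix of Table 9, taking each row total as $T_i$ and the grand total as $T$. The function name `expected_counts` and the list-of-lists layout are illustrative assumptions.

```python
# Bi-directional co-occurrence matrix from Table 9, merchant order M1, M2, M3.
observed = [
    [2, 1, 5],
    [1, 0, 3],
    [5, 3, 10],
]

def expected_counts(matrix):
    """Return T_i * T_j / T for every pair, using row totals as T_i."""
    row_totals = [sum(row) for row in matrix]   # T_i for each merchant
    total = sum(row_totals)                     # T, all co-occurrence events
    return [[t_i * t_j / total for t_j in row_totals] for t_i in row_totals]

expected = expected_counts(observed)

# Deviation of the observed count from the expected count for each pair.
deviation = [
    [obs - exp for obs, exp in zip(obs_row, exp_row)]
    for obs_row, exp_row in zip(observed, expected)
]
print(expected[0][2], deviation[0][2])  # pair (M1, M3)
```

For the pair (M1, M3) the expected count is 8 × 18 / 30 = 4.8 against an observed count of 5, so in this toy example the two merchants co-occur about as often as expected.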

[0114] Also, given the joint probability estimate $\hat{P}_{ij}$ and the number of independent co-occurrence events $T$,
the estimated probability distribution function for the number of times that merchant_i and merchant_j co-occur can be
determined. It is well known, from probability theory, that an experiment having $T$ independent trials (here
transactions) and a probability of success $\hat{P}_{ij}$ for each trial (success here being co-occurrence of merchant_i
and merchant_j) can be modeled using the binomial distribution. The total number of successes $k$, which in this case
represents the number of co-occurrences of merchants, has the following probability distribution:

$\Pr(t_{ij} = k \mid T_i, T_j, T) = \binom{T}{k} \hat{P}_{ij}^{k} (1 - \hat{P}_{ij})^{T-k}$   [8]

[0115] This distribution has mean:

$E[t_{ij}] = T \hat{P}_{ij} = \frac{T_i T_j}{T}$   [9]

which is the same value as was previously estimated using a different approach. The distribution has variance:

$\mathrm{Var}[t_{ij}] = T \hat{P}_{ij} (1 - \hat{P}_{ij}) = \frac{T_i T_j}{T} \left(1 - \frac{T_i T_j}{T^2}\right)$   [10]

[0116] The variance is used indirectly in UDL1, below. The standard deviation of $t_{ij}$, $\sigma_{ij}$, is the square
root of the variance $\mathrm{Var}[t_{ij}]$. If merchant_i and merchant_j are not related, the difference between the
actual and expected co-occurrence counts, $T_{ij} - \hat{T}_{ij}$, should not be much larger than $\sigma_{ij}$.
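The binomial reference point can be evaluated directly. The Python sketch below computes the mean and standard deviation for one merchant pair from $T_i$, $T_j$ and $T$, using the Table 9 totals for M1 and M3 as sample inputs; the function name `binomial_reference` is a hypothetical helper introduced for illustration.

```python
import math

def binomial_reference(t_i, t_j, total):
    """Return (mean, standard deviation) of the binomial model for one pair.

    The success probability is the joint estimate T_i * T_j / T**2 and the
    number of trials is the total number of co-occurrence events T.
    """
    p_ij = (t_i * t_j) / total ** 2
    mean = total * p_ij
    variance = total * p_ij * (1.0 - p_ij)
    return mean, math.sqrt(variance)

# Table 9 totals: T_1 = 8 (M1), T_3 = 18 (M3), T = 30; observed T_13 = 5.
mean, sigma = binomial_reference(8, 18, 30)
observed = 5
print(mean, sigma, observed - mean)
```

Here the deviation of 0.2 is far smaller than the standard deviation of about 2.0, which is what would be expected for unrelated merchants.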

c) Desired Dot-Products between Merchant Vectors

[0117] To calculate the desired dot product ($d_{ij}$) between two merchant vectors, the UDL algorithm compares the
number of observed co-occurrences (found in the bi-directional co-occurrence matrix) to the number of expected
co-occurrences. First, it calculates a raw relationship measure ($r_{ij}$) from the co-occurrence counts, and then it
calculates a desired dot product $d_{ij}$ from $r_{ij}$. There are at least three different ways that the relationship
strength and desired dot product can be calculated from the co-occurrence data:

Method: UDL1

[0118]

$r_{ij} = \frac{T_{ij} - \hat{T}_{ij}}{\sigma_{ij}}$   [11]

Method: UDL2

[0119]

$r_{ij} = \operatorname{sign}(T_{ij} - \hat{T}_{ij}) \cdot \sqrt{2 \ln \lambda}$   [13]

Method: UDL3

[0120]

$r_{ij} = \operatorname{sign}(T_{ij} - \hat{T}_{ij}) \cdot [\ldots]$   [14]

[...]   [15]

where $T_{ij}$ is the actual number of co-occurrence events for merchant_i and merchant_j, and $\sigma_r$ is the
standard deviation of all the $r_{ij}$.

[0121] In UDL2 and UDL3, the log-likelihood ratio, $\ln \lambda$, is given by:

[...]



[0122] Each technique calculates the unexpected deviation, that is, the deviation of the actual co-occurrence count
from the expected co-occurrence count. In terms of the previously defined variables, the unexpected deviation is:

$D_{ij} = T_{ij} - \hat{T}_{ij}$

Thus, $D_{ij}$ may be understood as a raw measure of unexpected deviation.

[0123] As each method uses the same unexpected deviation measure, the only difference between the techniques
is that they use different formulas to calculate $r_{ij}$ from $D_{ij}$. (Note that other calculations of dot product
may be used.)
[0124] The first technique, UDL1, defines $r_{ij}$ to be the unexpected deviation $D_{ij}$ divided by the standard
deviation of the predicted co-occurrence count. This formula for the relationship measure is closely related to
chi-squared ($\chi^2$), a significance measure commonly used by statisticians. In fact

$\chi^2 = r_{ij}^2 = \frac{(T_{ij} - \hat{T}_{ij})^2}{\sigma_{ij}^2}$   [17]



[0125] For small count situations, i.e. when $\hat{T}_{ij} \ll 1$, UDL1 gives overly large values for $r_{ij}$. For
example, in a typical retail transaction data set, which has more than 90% small counts, values of $r_{ij}$ on the order
of 10^[...] have been seen. Data sets having such a high percentage of large relationship measures can be problematic,
because in these cases $\sigma_r$ also becomes very large. Since the same $\sigma_r$ is used by all co-occurrence pairs,
large values of $\sigma_r$ cause the ratio $r_{ij} / \sigma_r$ to become very small for pairs that do not suffer from
small counts. Therefore in these cases $d_{ij}$ becomes

$d_{ij} = \tanh(\ldots) \approx 0$   [18]

[0126] This property is not desirable, because it forces the merchant vectors of two merchants to be orthogonal,
even when the two merchants co-occur significantly more often than expected.
[0127] The second technique, UDL2, overcomes the small count problem by using log-likelihood ratio estimates
to calculate $r_{ij}$. It has been shown that log-likelihood ratios have much better small count behavior than
$\chi^2$, while at the same time retaining the same behavior as $\chi^2$ in the non-small count regions.
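The difference in small count behavior can be illustrated numerically. The sketch below is not the formula of the specification: it assumes a standard binomial likelihood ratio for $\ln \lambda$ (observed rate $k/T$ against expected rate $p$), which may differ from the intended definition, and the function names `udl1_measure` and `udl2_measure` are hypothetical labels for the two styles of relationship measure.

```python
import math

def log_likelihood_ratio(k, total, p):
    """ln(lambda) for k successes in `total` trials versus expected rate p.

    Assumption: binomial likelihood ratio of the observed rate k/total
    against p, with the convention 0 * ln(0) = 0.
    """
    def term(count, expected):
        return count * math.log(count / expected) if count > 0 else 0.0
    value = term(k, total * p) + term(total - k, total * (1.0 - p))
    return max(value, 0.0)  # guard against tiny negative rounding errors

def udl1_measure(k, total, p):
    """Deviation from the expected count divided by the binomial std. dev."""
    sigma = math.sqrt(total * p * (1.0 - p))
    return (k - total * p) / sigma

def udl2_measure(k, total, p):
    """Signed square root of twice the assumed log-likelihood ratio."""
    sign = 1.0 if k >= total * p else -1.0
    return sign * math.sqrt(2.0 * log_likelihood_ratio(k, total, p))

# Non-small counts: expected 100 co-occurrences, observed 130.
large = (udl1_measure(130, 10_000, 0.01), udl2_measure(130, 10_000, 0.01))
# Small counts: expected 0.01 co-occurrences, observed 1.
small = (udl1_measure(1, 1_000_000, 1e-8), udl2_measure(1, 1_000_000, 1e-8))
print(large, small)
```

Under these assumptions the two measures are close when counts are large, while for the small count case the deviation-over-sigma measure is several times larger than the log-likelihood based one.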

[0128] The third technique, UDL3, is a slightly modified version of UDL2. The only difference is that the log
likelihood ratio estimate is scaled by [...]. This scaling removes the bias from the log likelihood ratio estimate. The
preferred embodiment uses UDL2 in most cases.
[0129] Accordingly, the present invention generally proceeds as follows: 

1. For each pair of root merchant names, determine the expected number of co-occurrences of the pair from the total
number of co-occurrence transactions involving each merchant name (with any merchant) and the total number of
co-occurrence transactions.

2. For each pair of root merchant names, determine a relationship strength measure based on the difference
between the expected number of co-occurrences and the actual number of co-occurrences.

3. For each pair of root merchant names, determine a desired dot product between the merchant vectors from the
relationship strength measure.

d) Merchant Vector Training 

[0130] The goal of vector training is to position the merchant vectors in a high-dimensional vector space such that
the dot products between them closely approximate their desired dot products. (In a preferred embodiment, the vector
space has 280 dimensions, though more or fewer could be used.) Stated more formally: given a set of merchant vectors
and the set of desired dot products for each pair of vectors, position each merchant vector such that a cost function
is minimized, e.g.:

[...]

[0131] [...] thousand or more high-dimensional vectors. [...] minimizing the cost function are preferred.

[0132] One such approach is based on gradient descent, in which the desired dot product is compared to [...] less
than desired. [...]

$V_i(n+1) = V_i(n) + \alpha (d_{ij} - V_i \cdot V_j) V_j$   [20]

$V_i(n+1) = \frac{V_i(n+1)}{\lVert V_i(n+1) \rVert}$   [21]

$V_j(n+1) = V_j(n) + \alpha (d_{ij} - V_i \cdot V_j) V_i$   [22]

$V_j(n+1) = \frac{V_j(n+1)}{\lVert V_j(n+1) \rVert}$   [23]
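A gradient-descent style of training can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions rather than the implementation of the specification: the cost is taken to be the sum of squared differences between desired and actual dot products, each vector in a pair is nudged along the other in proportion to the error and then renormalized to unit length, and the learning rate `alpha`, the 8-dimensional space and the three desired dot products are made-up values.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    norm = math.sqrt(dot(v, v))
    return [x / norm for x in v]

def cost(vectors, desired):
    """Assumed cost: sum of squared errors between desired and actual dot products."""
    return sum((d - dot(vectors[i], vectors[j])) ** 2 for (i, j), d in desired.items())

def train(vectors, desired, alpha=0.1, epochs=500):
    """Repeatedly move each pair of vectors so their dot product approaches d_ij."""
    vectors = [list(v) for v in vectors]
    for _ in range(epochs):
        for (i, j), d in desired.items():
            v_i, v_j = vectors[i], vectors[j]
            err = d - dot(v_i, v_j)
            # Both updates use the error computed from the current vectors.
            new_i = [a + alpha * err * b for a, b in zip(v_i, v_j)]
            new_j = [b + alpha * err * a for a, b in zip(v_i, v_j)]
            vectors[i] = normalize(new_i)
            vectors[j] = normalize(new_j)
    return vectors

rng = random.Random(0)
initial = [normalize([rng.gauss(0.0, 1.0) for _ in range(8)]) for _ in range(3)]
# Hypothetical desired dot products: positive, negative and unrelated pairs.
desired = {(0, 1): 0.8, (0, 2): -0.5, (1, 2): 0.0}
trained = train(initial, desired)
print(cost(initial, desired), cost(trained, desired))
```

Visiting one pair at a time keeps the work per update proportional to the vector dimension, which matters when the number of merchant pairs is very large.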
[0133] This technique converges as long as the [...] of the particular transaction data being used: typically [...].
[0134] An alternative methodology uses averages of [...] the desired position of the current merchant vector [...]
the current position of the [...]. An error weighted [...].

Written in terms of vector equations, the update rule is:

[...]   [24]

[...]   [25]

where [...] is the desired dot product between $V_i$ and $V_j$, and [...] is the current dot product between $V_i$
and $V_j$.
[0135] Since [...] is a linear combination of merchant vectors $V_i$ and $V_j$, it will always be in the plane of
these vectors $V_i$ and $V_j$.

[0136] The result of any of these various approaches is a final set of merchant vectors for all merchant names.
[0137] Appendix II, below, provides a geometrically derived algorithm for the error weighted update process.
Appendix III provides an algebraically derived algorithm of this process, which results in an efficient code
implementation, and which produces the same results as the algorithm of Appendix II.

[0138] Those of skill in the art will appreciate that the UDL algorithm, including its variants above, and the
implementations in the appendices, may be used in contexts outside of determining merchant co-occurrences. This aspect
of the present invention may be used for vector representation and co-occurrence analysis in any application domain,
for example, where there is need for representing high frequency data items without exclusion. Thus, the UDL algorithm
may be used in information retrieval, document routing, and other fields of information analysis.

3. Clustering Module

[0139] Following generation and training of the merchant vectors, the clustering module 520 is used to cluster the
resulting merchant vectors and identify the merchant segments. Various different clustering algorithms may be used,
including k-means clustering (MacQueen). The output of the clustering is a set of merchant segment vectors, each
being the centroid of a merchant segment, and a list of merchant vectors (thus merchants) included in the merchant
segment.
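The clustering step can be illustrated with a small k-means sketch in Python. It is a generic batch (Lloyd-style) variant with a naive initialization from the first k vectors, chosen here for brevity and not necessarily the variant used in the described system; the two-dimensional merchant vectors are made-up values.

```python
def kmeans(vectors, k, iterations=20):
    """Plain batch k-means; returns (centroids, assignments)."""
    centroids = [list(v) for v in vectors[:k]]  # naive init: first k vectors
    assignments = [0] * len(vectors)
    for _ in range(iterations):
        # Assignment step: nearest centroid by squared Euclidean distance.
        for idx, v in enumerate(vectors):
            assignments[idx] = min(
                range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(v, centroids[c])),
            )
        # Update step: each centroid becomes the mean of its member vectors.
        for c in range(k):
            members = [v for v, a in zip(vectors, assignments) if a == c]
            if members:
                centroids[c] = [sum(col) / len(members) for col in zip(*members)]
    return centroids, assignments

# Hypothetical trained merchant vectors (2 dimensions for readability).
merchant_vectors = {
    "HILTON": [0.9, 0.1],
    "STAPLES": [0.1, 0.9],
    "ALAMO": [0.8, 0.2],
    "NORDSTROMS": [0.2, 0.8],
}
names = list(merchant_vectors)
centroids, assignments = kmeans([merchant_vectors[n] for n in names], k=2)

segments = {}
for name, segment in zip(names, assignments):
    segments.setdefault(segment, []).append(name)
print(segments, centroids)
```

Each centroid plays the role of a merchant segment vector, and the member list gives the merchants included in that segment.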

[0140] There are two different clustering approaches that may be usefully employed to generate the merchant
segments. First, clustering may be done on the merchant vectors themselves. This approach looks for merchants having
merchant vectors which are substantially aligned in the vector space, clusters these merchants into segments, and
computes a cluster vector for each segment. Thus, merchants for whom transactions frequently co-occur and have high
dot products between their merchant vectors will tend to form merchant segments. Note that it is not necessary for all
merchants in a cluster to all co-occur in many consumers' transactions. Instead, co-occurrence is associative: if
merchants A and B co-occur frequently, and merchants B and C co-occur frequently, A and C are likely to be in the same
merchant segment.

[0141] A second clustering approach is to use the consumer vectors. For each account identifier, a consumer
vector is generated as the summation of the vectors of the merchants at which the consumer has purchased in a defined
time interval, such as the previous three months. A simple embodiment of this is:

$C = \sum_{i=1}^{N} V_i$   [26]

where $C$ is the consumer vector for an account, $N$ is the number of unique root merchant names in the customer
account's transaction data within a selected time period, and $V_i$ is the merchant vector for the i-th unique root
merchant name. The consumer vector is then normalized to unit length.

[0142] A more interesting consumer vector takes into account various weighting factors to weight the significance
of each merchant's vector:

$C = \sum_{i=1}^{N} W_i V_i$   [27]

where $W_i$ is a weight applied to the merchant vector $V_i$. For example, a merchant vector may be weighted by the total
(or average) purchase amount by the consumer at the merchant in the time period, by the time since the last purchase,
by the total number of purchases in the time period, or by other factors.
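Both forms of the consumer vector can be sketched with one small Python function. The name `consumer_vector` is hypothetical, the merchant vectors and weights are made-up values, and normalizing the weighted sum to unit length is an assumption carried over from the unweighted case.

```python
import math

def consumer_vector(merchant_vectors, weights=None):
    """Sum the merchant vectors (optionally weighted) and normalize to unit length."""
    if weights is None:
        weights = [1.0] * len(merchant_vectors)   # unweighted summation
    dims = len(merchant_vectors[0])
    total = [
        sum(w * v[d] for w, v in zip(weights, merchant_vectors))
        for d in range(dims)
    ]
    norm = math.sqrt(sum(x * x for x in total))
    return [x / norm for x in total]

# Two hypothetical merchant vectors for one account.
vectors = [[1.0, 0.0], [0.0, 1.0]]

simple = consumer_vector(vectors)
# Weights might be, for example, total purchase amounts at each merchant.
weighted = consumer_vector(vectors, weights=[3.0, 4.0])
print(simple, weighted)
```

With weights of 3 and 4 the consumer vector tilts toward the second merchant, giving (0.6, 0.8) after normalization.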

[0143] However computed, the consumer vectors can then be clustered, so that similar consumers, based on their
purchasing behavior, form a merchant segment. This defines a merchant segment vector. The merchant vectors which
are closest to a particular merchant segment vector are deemed to be included in the merchant segment.
[0144] With the merchant segments and their segment vectors, the predictive models for each segment may be
developed. Before discussing the creation of the predictive models, the training data used in this process is
described.
DATA POSTPROCESSING MODULE

[0145] Following identification of merchant segments, a predictive model of consumer spending in each segment is
generated from past transactions of consumers in the merchant segment. Using the past transactions of consumers in
the merchant segment provides a robust base on which to predict future spending, and since the merchant segments
were identified on the basis of the actual spending patterns of the consumers, the arbitrariness of conventional
demographic based predictions is minimized. Additional non-segment specific transactions of the consumer may also be
used to provide a base of transaction behavior.

[0146] To create the segment models, the consumer transaction data is organized into groups of observations.
Each observation is associated with a selected end-date. The end-date divides the observation into a prediction window
and an input window. The input window includes a set of transactions in a defined past time interval prior to the selected
end-date (e.g. 6 months prior). The prediction window includes a set of transactions in a defined time interval after the
selected end-date (e.g. the next 3 months). The prediction window transactions are the source of the dependent
variables for the prediction, and the input window transactions are the source of the independent variables for the
prediction.
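The split of an observation into input and prediction windows can be sketched as follows. This is a minimal Python illustration under stated assumptions: transactions are (date, merchant, amount) tuples, the end-date falls on the first day of a month, windows are measured in whole calendar months, and the end-date itself belongs to the prediction window. The helper names `add_months` and `make_observation` and the sample transactions are hypothetical.

```python
from datetime import date

def add_months(first_of_month, months):
    """Shift a first-of-month date by a whole number of months."""
    index = first_of_month.year * 12 + (first_of_month.month - 1) + months
    return date(index // 12, index % 12 + 1, 1)

def make_observation(transactions, end_date, input_months=6, prediction_months=3):
    """Split transactions into (input_window, prediction_window) around end_date."""
    start = add_months(end_date, -input_months)
    stop = add_months(end_date, prediction_months)
    input_window = [t for t in transactions if start <= t[0] < end_date]
    prediction_window = [t for t in transactions if end_date <= t[0] < stop]
    return input_window, prediction_window

# Made-up transactions: (date, merchant, amount).
transactions = [
    (date(1998, 12, 31), "HILTON", 120.0),    # before the input window
    (date(1999, 2, 15), "STAPLES", 35.0),     # input window
    (date(1999, 6, 30), "ALAMO", 80.0),       # input window
    (date(1999, 7, 1), "NORDSTROMS", 60.0),   # prediction window
    (date(1999, 9, 30), "STAPLES", 20.0),     # prediction window
    (date(1999, 10, 1), "HILTON", 150.0),     # after the prediction window
]
inputs, targets = make_observation(transactions, end_date=date(1999, 7, 1))
print(len(inputs), len(targets))
```

Sliding the end-date forward one month at a time would yield a series of overlapping observations from the same transaction history.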
[0147] More particularly, the input for the observation generation module 530 are the master files 408. The output
is a set of observations for each account. Each account receives three types of observations. Fig. 8 illustrates the
observation types.

[0148] The first type of observations are training observations, which are used to train the predictive models that
predict future spending within particular merchant segments. If N is the length (in months) of the window over which
observation inputs are computed, then there are 2N-1 training observations for each segment.

[0149] In Fig. 8, there are shown 16 months of transaction data, from March of one year to June of the next.
Training observations are selected prior to the date of interest, November 1. The input window includes the 4 months of
past data to predict the next 2 months in the prediction window. The first input window 802a thus uses a selected date of
July 1 and includes March-June to encompass the past transactions; transactions in July-August form the prediction
window 803a. The next input window 802b uses August 1 as the selected date, with transactions in April-July as the past
transactions and August-September as prediction window 803b. The last input window for this set is 802d, which uses
November 1 as its selected date, with a prediction window 803d of observations in November-December.
[0150] The second type of observations are blind observations. Blind observations are observations where the
prediction window does not overlap any of the time frames for the prediction windows in the training observations. Blind
observations are used to evaluate segment model performance. In Fig. 8, the blind observations 804 include those from
September to February, as illustrated.

[0151] The third observation type is action observations, which are used in a production phase. Action observations
have only inputs (past transactions given a selected date) and no target transactions after the selected date. These are
preferably constructed with an input window that spans the final months of available data. These transactions are the
ones on which the actual predictions are to be made. Thus, they should be the transactions in an input window that
extends from a recent selected date (e.g. the most recent end of month) back the length of the input window used during
training. In Fig. 8, the action observations 806 span November 1 to end of February, with the period of actual prediction
being from March to end of May.

[0152] Fig. 8 also illustrates that at some point during the prediction window, the financial institution sends out
promotions to selected consumers based on their predicted spending in the various merchant segments.
[0153] Referring to Fig. 4b again, the DPPM takes the master files 408 and a given selected end-date, and
constructs for each consumer, and then for each segment, a set of training observations and blind observations from the
consumer's transactions, including transactions in the segment, and any other transactions. Thus, if there are 300
segments, for each consumer there will be 300 sets of observations. If the DPPM is being used during production for
prediction purposes, then the set of observations is a set of action observations.

[0154] For training purposes, the DPPM computes transaction statistics from the consumer's transactions. The
transaction statistics serve as independent variables in the input window, and as dependent variables from transactions
in the prediction window. In a preferred embodiment, these variables are as follows:

[0155] Prediction window: The dependent variables are generally any measure of amount or rate of spending by
the consumer in the segment in the prediction window. A simple measure is the total dollar amount that was spent in
the segment by the consumer in the transactions in the prediction window. Another measure may be the average amount
spent at merchants (e.g. total amount divided by number of transactions).

[0156] Input window: The independent variables are various measures of spending in the input window leading up
to the end date (though some may be outside of it). Generally, the transaction statistics for a consumer can be extracted
from various groupings of merchants. These groups may be defined as: 1) merchants in all segments; 2) merchants in
the merchant segment being modeled; 3) merchants whose merchant vector is closest to the segment vector for the
segment being modeled (these merchants may or may not be in the segment); and 4) merchants whose merchant vector
is closest to the consumer vector of the consumer.
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[01 57] One preferred set of Input variables includes: 

(1) Recency The amount of time In months between the current end date and the most recent transaction of the 
consumer In any segment Recency may computed over all available time and is not restricted to the input window. 

(2) Frequency. The number of transactions by a consumer In the input window preceding the end-date for all seg- 

(S) Monetary value of purchases. A measure of the amount of dollars spent by a customer in the input window pre- 
ceding the end-date for all segments. The total or average, or other measures may be used. 

(4) Recency.segment The amount of tme in months between the cun^nt end date and the most recent transac- 
tion of the consumer In the segment. Recency may be computed over all available time and is not restncted to the 

(5) Frequen^.segment. The number of transactions in the segment by a customer In the Input window preceding 

the current end date. .... ^ . ^ ^ 

(6) Monetary.segment. The amount of dollars spent in the segment by a customer in the input window preceding 

J5 the current end date. ^ . ^ ^ . j .w _ . 

(7) Recency nearest profile merchants. The amount of time in months between the current end date and the most 
recent transaction of the consumer in a collection of merchants that are nearest the consumer vector of the con- 
sumer Recency may be computed over all available time and is not restricted to the input window. 

(8) Frequency nearest profile merchants. The number of transactions in a collection of merchants that are nearest 
the consumer vector of the consumer by the consumer in the input window preceding the current end date. 

(9) Monetary nearest frequency merchants. The amount of dollars spent in a collection of merchants that are near- 
est the consumer vector of the consumer by the consumer in the Input wrtndow preceding the current end date. 

(10) Recency nearest segment merchants. The amount of time in months between the current end date and the 
most recent transaction of the consumer in a collection of merchants that are nearest the segment vector. Recency 

25 may be computed over all available time and is not restricted to the input window. 

(11) Frequency nearest segment merchants. The number of transactions in a collection of merchants that are near- 
est the segment vector by the consumer in the input window preceding the current end date. 

(12) Monetary nearest segment merchants. The amount of dollars spent In a collection of merchants that are near- 
est the segment vector by the consumer in the Input window preceding the cun-ent end date. 

(13) Segment probability score. The probability that a consumer will spend in the segment in the prediction window 
given all merchant transactions for the consumer in the input window preceding the end date. A preferred algonthm 
estimates combined probability using a recursive Bayesian method. 

(14) Seasonality variables. It is assumed that the fundamental period of the cyclic component is known. In the case
of seasonality, it can be assumed that the cycle is twelve months. Two variables are added to the model related to
seasonality. The first variable codes the sine of the date and the second variable codes the cosine of the date. The
calculations for these variables are:

Sin Input = sin( 2.0 * PI * (sample day of year) / 365)
Cos Input = cos( 2.0 * PI * (sample month of year) / 365).

(15) Segment Vector-Consumer Vector Closeness: As an optional input, the dot product of the segment vector for
the segment and the consumer vector is used as an input variable.

[0158] In addition to these transaction statistics, variables may be defined for the frequency of purchase and
monetary value for all cases of segment merchants, nearest profile merchants, nearest segment merchants for the same
forward prediction window in the previous year(s). 
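
The seasonality variables of item (14) can be illustrated with a minimal sketch. The function name `seasonality_inputs` and the `period_days` parameter are hypothetical, and the sketch assumes that both the sine and the cosine input are driven by the day of the year; the text above names "sample month of year" in the cosine formula, so this reading is an assumption rather than the specification's definition.

```python
import math
from datetime import date


def seasonality_inputs(sample_date: date, period_days: float = 365.0):
    """Return (sin_input, cos_input) for a sample date.

    Assumption: the day of the year is used for both inputs, so the pair
    traces one full circle over a twelve-month cycle.
    """
    day_of_year = sample_date.timetuple().tm_yday
    angle = 2.0 * math.pi * day_of_year / period_days
    return math.sin(angle), math.cos(angle)


# Example: inputs for a sample taken on 1 April 1999.
sin_input, cos_input = seasonality_inputs(date(1999, 4, 1))
```

Encoding the date as a sine and cosine pair keeps the end of December close to the start of January in the input space, which a single day-count variable would not do.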






PREDICTIVE MODEL GENERATION

[0159] The training observations for each segment are input into the segment predictive model generation module
530 to generate a predictive model for the segment. Fig. 9 illustrates the overall logic of the predictive model generation
process. The master files 408 are organized by accounts, based on account identifiers, here illustratively accounts 1
through N. There are M segments, indicated by segments 1 through M. The DPPM generates, for each combination of
account and merchant segment, a set of input and blind observations. The respective observations for each merchant
segment M from the many accounts 1...N are input into the respective segment predictive model M during training.
Once trained, each segment predictive model is tested with the corresponding blind observations. Testing may be done
by comparing, for each segment, a lift chart generated by the training observations with the lift chart generated from blind
observations. Lift charts are further explained below.

[0160] The predictive model generation module 530 is preferably a neural network, using a conventional multi-layer
organization, and backpropagation training. In a preferred embodiment, the predictive model generation module 530 is
provided by HNC Software's Database Mining Workstation, available from HNC Software of San Diego, California.
[0161] While the preferred embodiment uses neural networks for the predictive models, other types of predictive
models may be used. For example, linear regression models may be used.

H. PROFILING ENGINE

[0162] The profiling engine 412 provides analytical data in the form of an account profile about each customer 
whose data is processed by the system 400. The profiling engine is also responsible for updating consumer profiles 
over time as new transaction data for consumers is received. The account profiles are objects that can be stored in a 
database 414 and are used as input to the computational components of system 400 in order to predict future spending
by the customer in the merchant segments. The profile database 414 is preferably ODBC compliant, thereby allowing 
the accounts provider (e.g. financial institution) to import the data to perform SQL queries on the customer profiles. 
[0163] The account profile preferably includes a consumer vector, a membership vector describing a membership 
value for the consumer for each merchant segment, such as the consumer's predicted spending in each segment in a 
predetermined future time interval, and the recency, frequency, and monetary variables as previously described for pre- 
dictive model training. 

[0164] The profiling engine 412 creates the account profiles as follows. 

1. Membership Function: Predicted Spending in Each Segment

[0165] The profile of each account holder includes a membership value with respect to each segment. The
membership value is computed by a membership function. The purpose of the membership function is to identify the
segments with which the consumer is most closely associated, that is, which best represent the group or groups of
merchants at which the consumer has shopped, and is likely to shop at in the future.

[0166] In a preferred embodiment, the membership function computes the membership value for each segment as
the predicted dollar amount that the account holder will purchase in the segment given previous purchase history. The
dollar amount is projected for a predicted time interval (e.g. 3 months forward) based on a predetermined past time
interval (e.g. 6 months of historical transactions). These two time intervals correspond to the time intervals of the input
window and prediction windows used during training of the merchant segment predictive models. Thus, if there are 300
merchant segments, then a membership value set is a list of 300 predicted dollar amounts, corresponding to the
respective merchant segments. Sorting the list by the membership value identifies the merchant segments at which the
consumer is predicted to spend the greatest amounts of money in the future time interval, given their spending
historically.

[0167] To obtain the predicted spending, certain data about each account is input in each of the segment predictive 
models. The input variables are constructed for the profile consistent with the membership function of the profile. Pref- 
erably, the input variables are the same as those used during model training, as set forth above. An additional input var- 
iable for the membership function may include the dot product between the consumer vector and the segment vector 
for the segment (if the models are so trained). The output of the segment models is a predicted dollar amount that the
consumer will spend in each segment in the prediction time interval. 

2. Segment Membership Based on Consumer Vectors 

[0168] A second, alternate membership aspect of the account profiles is membership based upon the consumer
vector for each account profile. The consumer vector is a summary vector of the merchants that the account has
shopped at, as explained above with respect to the discussion of clustering. In this aspect, the dot product of the
consumer vector and segment vector for the segment defines a membership value. In this embodiment, the membership
value list is a set of 300 dot products, and the consumer is a member of the merchant segment(s) having the highest dot
product(s).

[0169] With either one of these membership functions, the population of accounts that are members of each seg- 
ment (based on the accounts having the highest membership values for each segment) can be determined. From this 
population, various summary statistics about the accounts can be generated such as cash advances, purchases, deb- 
its, and the like. This information is further described below. 
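
Both membership functions reduce to the same operation: compute one value per segment, then sort. The following is a minimal sketch under stated assumptions; the function names, the two-dimensional vectors, and the segment labels are hypothetical and chosen only for illustration.

```python
def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def membership_by_dot_product(consumer_vector, segment_vectors):
    """Map each segment id to the dot product with the consumer vector."""
    return {seg: dot(consumer_vector, vec) for seg, vec in segment_vectors.items()}


def top_segments(membership, k=1):
    """Return the k segment ids with the highest membership values.

    Works for either membership function: dot products or predicted dollars.
    """
    return sorted(membership, key=membership.get, reverse=True)[:k]


# Hypothetical segment vectors and consumer vector.
segments = {"travel": [1.0, 0.0], "grocery": [0.0, 1.0], "dining": [0.6, 0.8]}
consumer = [0.8, 0.6]
values = membership_by_dot_product(consumer, segments)
best = top_segments(values, k=2)

# The same helper ranks predicted spending (in dollars) per segment.
predicted_dollars = {"travel": 12.5, "grocery": 40.0, "dining": 3.0}
best_by_spend = top_segments(predicted_dollars, k=1)
```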
3. Updating of Consumer Profiles

[0170] As additional transactions of a consumer are received periodically (e.g. each month), the merchant vectors
associated with the merchants in the new transactions can be used to update the consumer vector, preferably using
averaging techniques, such as exponential averaging over the desired time interval for the update.
[0171] [...] the dollars spent at the merchant. Thus, merchant vectors are weighted in the new transaction period
by both the time and the significance of transactions for the merchant by the consumer (e.g. weighted by dollar amount
of transactions by consumer at merchant). One formula for weighting merchants is:

$W_i = S_i e^{\lambda t}$    [28]

where

$W_i$ is the weight to be applied to merchant i's merchant vector;
$S_i$ is the dollar amount of transactions at merchant i in the update time interval;
$t$ is the amount of time since the last transaction at merchant i; and
$\lambda$ is a constant that controls the overall influence of the merchant.
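
A minimal sketch of how such weights might drive a consumer vector update follows. Several details are assumptions and not taken from the text: a negative value of the constant (so that older transactions count less), the blending factor `alpha` used for the exponential averaging, the renormalization to unit length, and all function names.

```python
import math


def merchant_weight(dollars, months_since_last, lam=-0.1):
    """Weight for one merchant: dollars scaled by exp(lam * t).

    Assumption: lam is negative, so the weight decays with elapsed time.
    """
    return dollars * math.exp(lam * months_since_last)


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    length = math.sqrt(sum(x * x for x in v))
    return [x / length for x in v] if length > 0.0 else list(v)


def update_consumer_vector(consumer_vector, transactions, merchant_vectors,
                           alpha=0.3, lam=-0.1):
    """Blend the old consumer vector with a weighted summary of new merchants.

    transactions: list of (merchant_id, dollars, months_since_last) tuples.
    alpha: hypothetical exponential-averaging factor in [0, 1].
    """
    summary = [0.0] * len(consumer_vector)
    for merchant_id, dollars, months in transactions:
        w = merchant_weight(dollars, months, lam)
        for k, x in enumerate(merchant_vectors[merchant_id]):
            summary[k] += w * x
    summary = normalize(summary)
    blended = [(1.0 - alpha) * c + alpha * s
               for c, s in zip(consumer_vector, summary)]
    return normalize(blended)


# Example: one new purchase at merchant "m" pulls the consumer vector toward it.
updated = update_consumer_vector([1.0, 0.0], [("m", 50.0, 0)],
                                 {"m": [0.0, 1.0]}, alpha=0.5)
```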

[0172] The profiling engine 412 also stores a flag for each consumer vector indicating the time of the last update.
I. REPORTING ENGINE

[0173] The reporting engine 426 provides various types of segment and account specific reports. The reports are
generated by querying the profiling engine 412 and the account database for the segments and associated accounts,
and tabulating various statistics on the segments and accounts.

1. Basic Reporting Functionality

[0174] The reporting engine 426 provides functionality to: 

a) Search by merchant names, including raw merchant names, root names, or equivalence names. 

b) Sort merchant lists by merchant name, frequency of transactions, transaction amounts and volumes, number of
transactions at merchant or SIC code.

c) Filter contents of report by number of transactions at merchant. 

[0175] The reporting engine 426 provides the following types of reports, responsive to these input criteria: 
2. General Segment Report

[0176] For each merchant segment a very detailed and powerful analysis of the segment can be created in a
segment report. This information includes:

a) General Segment Information

[0177] Merchant Cohesion: A measure of how closely clustered the merchant vectors are in this segment. This is
the average of the dot products of the merchant vectors with the centroid vector of this segment. Higher numbers
indicate tighter clustering.

[0178] Number of Transactions: The number of purchase transactions at merchants in this segment, relative to the
total number of purchase transactions in all segments, providing a measure of how significant the segment is in
transaction volume.

[0179] Dollars Spent: The total dollar amount spent at merchants in this segment, relative to the total dollar amount
spent in all segments, providing a measure of dollar volume for the segment.

[0180] Most Closely Related Segments: A list of other segments that are closest to the current segment. This list
may be ranked by the dot products of the segment vectors, or by a measure of the conditional probability of purchase
in the other segment given a purchase in the current segment.

[0181] The conditional probability measure M is as follows: P(A|B) is the probability of purchase in segment A
given a purchase in segment B [...]

b) Segment Members Information

[0183] Detailed information is provided about each merchant which is a member of a segment. This information
comprises:

[...] the money spent in this segment that is spent at this merchant (percent);
Number of transactions: The number of purchase transactions at merchant;
[...] value of 1.0 indicates that the merchant vector is at the centroid);
SIC Description: The SIC code and its description;

[0184] This information may be sorted along any of the above dimensions. 
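
The Merchant Cohesion measure described above (the average dot product of the merchant vectors with the segment centroid) can be sketched as follows. The sketch assumes that the centroid is the mean of the merchant vectors rescaled to unit length, which the text does not state; the function names are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def unit(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v] if length > 0.0 else list(v)


def segment_centroid(merchant_vectors):
    """Assumed centroid: component-wise mean, rescaled to unit length."""
    dim = len(merchant_vectors[0])
    count = len(merchant_vectors)
    mean = [sum(v[k] for v in merchant_vectors) / count for k in range(dim)]
    return unit(mean)


def merchant_cohesion(merchant_vectors):
    """Average dot product of each merchant vector with the centroid."""
    centroid = segment_centroid(merchant_vectors)
    return sum(dot(v, centroid) for v in merchant_vectors) / len(merchant_vectors)


# Two identical vectors are perfectly cohesive; orthogonal ones are less so.
tight = merchant_cohesion([[1.0, 0.0], [1.0, 0.0]])
loose = merchant_cohesion([[1.0, 0.0], [0.0, 1.0]])
```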
c) Lift Chart

[0185] A lift chart is useful for validating the performance of the predictive models by comparing predicted spending
in a predicted time window with actual spending.

[0186] Table 10 illustrates a sample lift chart for a merchant segment:

Table 10: A sample segment lift chart.

Bin         Cumulative segment lift   Cumulative segment lift in $   Cumulative Population
1           5.56                      $109.05                        50,000
2           4.82                      $94.42                         100,000
3           3.82                      $74.92                         150,000
4           3.23                      $63.38                         200,000
5           2.77                      $54.22                         250,000
6           2.43                      $47.68                         300,000
7           2.20                      $43.20                         350,000
8           2.04                      $39.98                         400,000
9           1.88                      $36.79                         450,000
10          1.75                      $34.35                         500,000
11          1.63                      $31.94                         550,000
12          1.52                      $29.75                         600,000
13          1.43                      $28.02                         650,000
14          1.35                      $26.54                         700,000
15          1.28                      $25.08                         750,000
16          1.21                      $23.81                         800,000
17          1.16                      $22.65                         850,000
18          1.10                      $21.56                         900,000
19          1.05                      $20.57                         950,000
20          1.00                      $19.60                         1,000,000
Base-line                             $19.60


[0187] Lift charts are created generally as follows:

[0188] As before, there is a defined input window and prediction window, for example 6 and 3 months respectively.
Data from the total length of these windows relative to the end of the most recent spending data available is taken. For
example, if data on actual spending in the accounts is available through the end of the current month, then the prior
three months of actual data will be used as the prediction window, and the data for the six months prior to that will be
data for the input window. The input data is then used to 'predict' spending in the three month prediction window, for which
in fact there is actual spending data. The predicted spending amounts are now compared with the actual amounts to
validate the predictive models.

[0189] For each merchant segment then, the consumer accounts are ranked by their predicted spending for the
segment in the prediction window period. Once the accounts are ranked, they are divided into N (e.g. 20) equal sized
bins so that bin 1 has the highest spending accounts, and bin N has the lowest ranking accounts. This identifies the
account holders that the predictive model for the segment indicates are expected to spend the most in this
segment.

[0190] Then, for each bin, the average actual spending per account in this segment in the past time period, and the
average predicted spending, are computed. The average actual spending over all bins is also computed. This average
actual spending for all accounts is the baseline spending value (in dollars), as illustrated in the last line of Table 10. This
number describes the average that all account holders spent in the segment in the prediction window period.
[0191] The lift for a bin is the average actual spending by accounts in the bin divided by the baseline spending
value. If the predictive model for the segment is accurate, then those accounts in the highest ranked bins should have
a lift greater than 1, and the lift should generally be increasing, with bin 1 having the highest lift. Where this is the case,
as for example in Table 10, in bin 1, this shows that those accounts in bin 1 in fact spent several times the baseline,
thereby confirming the prediction that these accounts would in fact spend more than others in this segment.
[0192] The cumulative lift for a bin is computed by taking the average spending by accounts in that bin and all higher
ranking bins, and dividing it by the baseline spending (i.e. the cumulative lift for bin 3 is the average spending per
account in bins 1 through 3, divided by the baseline spending). The cumulative lift for bin N is always 1.0. The
cumulative lift is useful to identify a group of accounts which are to be targeted for promotional offers.

[0193] The lift information allows the financial institution to very selectively target a specific group of accounts (e.g. 
the accounts in bin 1) with promotional offers related to the merchants in the segment. This level of detailed, predictive
analysis of very discrete groups of specific accounts relative to merchant segments is not believed to be currently avail- 
able by conventional methods. 
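
The lift chart construction described in this section can be sketched in a few lines. The function name `lift_chart` and its return layout are hypothetical, and the sketch assumes the number of accounts divides evenly into the bins. As a check against Table 10, the cumulative dollar figure for bin 1 divided by the baseline, $109.05 / $19.60, is approximately 5.56, the cumulative lift shown for that bin.

```python
def lift_chart(predicted, actual, n_bins=20):
    """Compute per-bin lift and cumulative lift.

    predicted, actual: equal-length lists with one entry per account
    (predicted and actual spending in the segment for the prediction window).
    Returns one dict per bin with keys "bin", "lift" and "cumulative_lift".
    """
    if len(predicted) != len(actual) or len(predicted) % n_bins != 0:
        raise ValueError("need equal-length inputs divisible into n_bins")
    # Rank accounts by predicted spending, highest first.
    order = sorted(range(len(predicted)), key=lambda i: predicted[i], reverse=True)
    baseline = sum(actual) / len(actual)  # average actual spending, all accounts
    bin_size = len(order) // n_bins
    rows = []
    running_total = 0.0
    for b in range(n_bins):
        members = order[b * bin_size:(b + 1) * bin_size]
        bin_total = sum(actual[i] for i in members)
        running_total += bin_total
        rows.append({
            "bin": b + 1,
            "lift": (bin_total / bin_size) / baseline,
            "cumulative_lift": (running_total / ((b + 1) * bin_size)) / baseline,
        })
    return rows


# Four accounts in two bins: the model ranks the bigger spenders first.
rows = lift_chart(predicted=[40, 30, 20, 10], actual=[30, 10, 15, 5], n_bins=2)
```

By construction the cumulative lift of the last bin equals 1.0, since it averages over every account.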

d) Population Statistics Tables

[0194] The reporting engine 426 further provides two types of analyses of the financial behavior of a population of
accounts that are associated with a segment based on various selection criteria. The Segment Predominant Scores
Account Statistics table and the Segment Top 5% Scores Account Statistics table present averaged account statistics
for two different types of populations of customers who shop, or are likely to shop, in a given segment. The two
populations are determined as follows.

[0195] Segment Predominant Scores Account Statistics Table: All open accounts with at least one purchase
transaction are scored (predicted spending) for all of the segments. Within each segment, the accounts are ranked by score,
and assigned a percentile ranking. The result is that for each account there is a percentile ranking value for each of the
merchant segments.

[0196] The population of interest for a given segment is defined as those accounts which have their highest
percentile ranking in this segment. For example, if an account has its highest percentile ranking in segment #108, that account
will be included in the population for the statistics table for segment #108, but not in any other segment. This approach
assigns each account holder to one and only one segment.

[0197] Segment Top 5% Scores Account Statistics: For the Segment Top 5% Scores Account Statistics table, the
population is defined as the accounts with percentile ranking of 95% or greater in a current segment. These are the 5%
of the population that is predicted to spend the most in the segment in the predicted future time interval following the
input data time window. These accounts may appear in this population in more than one segment, so that high
spenders will show up in many segments; concomitantly, those who spend very little may not be assigned to any segment.
[0198] The number of accounts in the population for each table is also determined and can be provided as a raw
number, and as a percentage of all open accounts (as shown in the titles of the following two tables).

[0199] Table 11 and Table 12 provide samples of these two types of tables:



Table 11

Segment Predominant Scores Account Statistics: 8291 accounts (0.17 percent)

Category             Mean Value   Std Deviation   Population Mean   Relative Score
Cash Advances        $11.28       $53.18          $6.65             169.67
Cash Advance Rate    0.03         0.16            0.02              159.92
Purchases            $166.86      $318.86         $192.91           86.50
Purchase Rate        0.74         1.29            1.81              40.62
Debits               $178.14      $324.57         $199.55           89.27
Debit Rate           0.77         1.31            1.84              41.99
Dollars in Segment   4.63         14.34           10.63%            43.53
Rate in Segment      3.32         9.64            11.89%            27.95

Table 12

Segment Top 5% Scores Account Statistics: 154786 accounts (3.10 percent)

Category             Mean Value   Std Deviation   Population Mean   Relative Score
Cash Advances        $9.73        $51.21          $7.27             133.79
Cash Advance Rate    0.02         0.13            0.02              125.62
Purchases            $391.54      $693.00         $642.06           60.98
Purchase Rate        2.76         4.11            7.51              36.77
Debits               $401.27      $702.25         $649.34           61.80
Debit Rate           2.79         4.12            7.53              37.00
Dollars in Segment   1.24         8.14            1.55%             80.03
Rate in Segment      0.99         6.70            1.79%             55.04




i) Segment Statistics

[0200] The tables present the following statistics for each of several categories, one category per row. The statistics
are:

Mean Value: the average over the population being scored;

Std Deviation: the standard deviation over the population being scored;

Population Mean: the average, over all the segments, of the Mean Value (this column is thus the same for all
segments, and is included for ease of comparison); and

Relative Score: the Mean Value, as a fraction of the Population Mean (in percent).

ii) Row Descriptions

[0201] Each table contains rows for spending and rate in Cash Advances, Purchases, Debits, and Total Spending.

• The rows for spending (Cash Advances, Purchases, and Debits) show statistics on dollars per month for all
accounts in the population over the time period of available data.

• The rate rows (Cash Advance Rate, Debit Rate, and Purchase Rate) show statistics on the number of transactions
per month for all accounts in the population over the time period of available data.

• Debits consist of Cash Advances and Purchases.

• The Dollars in Segment shows the fraction of total spending that is spent in this segment. This informs the financial
institution of how significant overall this segment is.

• The Rate in Segment shows the fraction of total purchase transactions that occur in this segment.

[0202] The differences between these two populations are subtle but important, and are illustrated by the above
tables. The segment predominant population identifies those individuals as members of a segment who, relative to their
own spending, are predicted to spend the most in the segment. For example, assume a consumer whose predicted
spending in a segment is $20.00, which gives the consumer a percentile ranking of 75th percentile. If the consumer's
percentile ranking in every other segment is below the 75th percentile, then the consumer is selected in this population
for this segment. Thus, this may be considered an intra-account membership function.

[0203] The Top 5% scores population instead includes those account holders predicted to spend the most in the
segment, relative to all other account holders. Thus, the account holder who was predicted to spend only $20.00 in the
merchant segment will not be a member of this population since he is well below the 95th percentile, which may be
predicted to spend, for example, $100.00.

[0204] In the example tables these differences are pronounced. In Table 11, the average purchases of the segment
predominant population is only $166.86. In Table 12, the average purchase by the top 5% population is more than twice
that, at $391.54. This information allows the financial institution to accurately identify accounts which are most likely to
spend in a given segment, and target these accounts with promotional offers for merchants in the segment.

[0205] The above tables may also be constructed based on other functions to identify accounts associated with
segments, including dot products between consumer vectors and segment vectors. 
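
The two populations can be derived from the same per-segment scores, as in the sketch below. It is illustrative only: the function names are hypothetical, the percentile of an account is assumed to be the share of accounts ranked below it in the segment, and ties are broken arbitrarily.

```python
def percentile_ranks(scores):
    """scores: account id -> predicted spending in one segment.

    Returns account id -> percentile, taken here as the percentage of
    accounts ranked below the account in this segment (an assumption).
    """
    ordered = sorted(scores, key=scores.get)
    n = len(ordered)
    return {acct: 100.0 * rank / n for rank, acct in enumerate(ordered)}


def segment_populations(scores_by_segment, top_percentile=95.0):
    """Return (predominant, top) populations, each a dict segment -> set of accounts."""
    ranks = {seg: percentile_ranks(s) for seg, s in scores_by_segment.items()}
    accounts = set().union(*(s.keys() for s in scores_by_segment.values()))

    # Segment predominant: each account joins the one segment where its own
    # percentile ranking is highest.
    predominant = {seg: set() for seg in ranks}
    for acct in accounts:
        best = max(ranks, key=lambda seg: ranks[seg].get(acct, 0.0))
        predominant[best].add(acct)

    # Top scores: every account at or above the cutoff percentile in a segment,
    # so one account may appear in several segments or in none.
    top = {seg: {a for a, p in r.items() if p >= top_percentile}
           for seg, r in ranks.items()}
    return predominant, top


scores = {
    "A": {"a1": 100.0, "a2": 50.0, "a3": 20.0, "a4": 5.0},
    "B": {"a1": 80.0, "a2": 90.0, "a3": 1.0, "a4": 30.0},
}
predominant, top = segment_populations(scores, top_percentile=75.0)
```

In the toy data, account a3 belongs to the predominant population of segment A even though its predicted spending there is low, which mirrors the intra-account character of that membership function.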

J. TARGETING ENGINE

[0206] The targeting engine 422 allows the financial institution to specify targeted populations for each (or any)
merchant segment, to enable selection of the targeted population for receiving predetermined promotional offers.
[0207] A financial institution can specify a targeted population for a segment by specifying a population count for
the segment, for example, the top 1000 account holders, or the top 10% account holders in a segment. The selection
is made by any of the membership functions, including dot product, or predicted spending. Other targeting
specifications may be used in conjunction with these criteria, such as a minimum spending amount in the segment, such as
$100. The parameters for selecting the targeting population are defined in a target specification document 424 which is
an input to the targeting engine 422. One or more promotions can be specifically associated with certain merchants in
a segment, such as the merchants with the highest correlation with the segment vector, highest average transaction
amount, or other selective criteria. In addition, the amounts offered in the promotions can be specific to each consumer
selected, and based on their predicted or historical spending in the segment. The amounts may also be dependent on
the specific merchant for whom a promotion is offered, as a function of the merchant's contributions to purchases in the
segment, such as based upon their dollar bandwidth, average transaction amount or the like. The selected accounts
can be used to generate a targeted segmentation report 430 by providing the account identifiers for the selected
accounts to the reporting engine 426, which constructs the appropriate targeting report on the segment. This report has
the same format as the general segment report but is compiled for the selected population.
[0208] An example targeting specification 424 is shown below:

values between two selected time periods, typically using data in a most recent prediction window (either ending or
beginning with a current statement date) relative to memberships in prior time intervals. The financial institution can
define a threshold change value for selecting accounts with changes in membership more significant than the
threshold. The selected accounts may then be provided to the reporting engine 426 for generation of various reports, including
a segment transition report 432 which is like the general segment report except that it applies to accounts that are
considered to have transitioned to or from a segment. This further enables the financial institution to selectively target these
customers with promotional offers for merchants in the segments in which the consumer had the most significant
positive increases in membership.

[0217] In summary then, the present invention provides a variety of powerful analytical methods which predict
consumer financial behavior in discretely defined merchant segments, and with respect to predetermined time intervals.
The clustering of merchants in merchant segments allows analysis of transactions of consumers in each specific
segment, both historically, and in the predicted period to identify consumers of interest. Identified consumers can then be
targeted with promotional offers precisely directed at merchants within specific segments.

[0218] Appendix I and II are part of the specification.

APPENDIX I: N-gram Matching Algorithm

1. A set of training examples is presented to the algorithm. In
this case, the training examples are all the merchant names that
are being processed.

2. Each training example is broken down into all possible n-grams,
for a selected value of n (n=3 for trigrams). E.g., the merchant
name "wal-mart" yields the trigrams ^^w, ^wa, wal, al-, l-m,
-ma, mar, art, rt^, t^^, where ^ is an "end of string" token.

3. The frequencies with which each trigram appears anywhere in the
training examples are counted.

4. In the preferred embodiment, each trigram is assigned a weight,
given by

$W_{xyz}$ = [...]

where xyz indicates the particular trigram, $F_{xyz}$ is the number of
times the trigram appeared anywhere in the training examples,
and N is the maximum value of F for all trigrams. Thus,
frequently occurring trigrams are assigned low weights, while
rare trigrams are assigned high weights. Other weighting
schemes, including uniform weights, are possible.

5. A high dimensional vector space is constructed, with one
dimension for each trigram that appears in the set of training
examples.

6. To compare two particular strings of characters (merchant
names), string1 and string2, each string is represented by a
vector in the vector space. The vector for string1 is
constructed by:

a) counting the frequency of each trigram in the string,

b) assembling a weighted sum of unit vectors,

$V_{string1} = \sum_{xyz} W_{xyz} \, u_{xyz}$

where xyz ranges over all trigrams in string1, and $u_{xyz}$ is a
unit vector in the direction of the xyz dimension in the
vector space.

c) normalizing $V_{string1}$ to a length of one (preferred
embodiment), or utilizing another normalization, or providing
no normalization at all.

d) construct the similar vector corresponding to the other
string, $V_{string2}$.

e) take the dot product of $V_{string1}$ and $V_{string2}$. A high dot product
(near one) indicates that the two strings are closely related,
while a low dot product (near zero) indicates that the two
strings are not related.

7. Two merchant names are equivalenced if their vectors' dot
product is greater than a particular threshold. This threshold
is typically in the range of 0.6 to 0.9 for the preferred
embodiment.
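
The steps of Appendix I can be sketched as follows. Because the weight formula of step 4 is not legible in the source, the sketch substitutes an assumed inverse-log weighting, 1 - log(F)/log(N), chosen only because it gives frequent trigrams low weights and rare trigrams high weights; it should not be read as the formula of the specification. The padding token, the lower-casing of names, and all function names are likewise assumptions.

```python
import math
from collections import Counter

PAD = "^"  # assumed "end of string" token


def trigrams(name, n=3):
    """Break a name into n-grams, padding both ends with the token."""
    padded = PAD * (n - 1) + name.lower() + PAD * (n - 1)
    return [padded[i:i + n] for i in range(len(padded) - n + 1)]


def trigram_weights(names):
    """Assumed weighting: 1 - log(F) / log(N), with N the largest count."""
    freq = Counter(t for name in names for t in trigrams(name))
    n_max = max(freq.values())
    if n_max <= 1:
        return {t: 1.0 for t in freq}
    return {t: 1.0 - math.log(f) / math.log(n_max) for t, f in freq.items()}


def name_vector(name, weights):
    """Sparse unit vector: weighted sum over the trigrams of the name."""
    vec = Counter()
    for t in trigrams(name):
        vec[t] += weights.get(t, 1.0)  # unseen trigrams are treated as rare
    length = math.sqrt(sum(v * v for v in vec.values()))
    return {t: v / length for t, v in vec.items()} if length > 0.0 else {}


def similarity(a, b, weights):
    """Dot product of the two normalized name vectors."""
    va, vb = name_vector(a, weights), name_vector(b, weights)
    return sum(v * vb.get(t, 0.0) for t, v in va.items())


names = ["wal-mart", "walmart", "wal mart", "target", "target stores", "costco"]
weights = trigram_weights(names)
close = similarity("wal-mart", "walmart", weights)
far = similarity("wal-mart", "costco", weights)
```

With a corpus this small the shared trigrams are the frequent ones and so receive low weights; a realistic set of merchant names would be needed before comparing the dot products against the 0.6 to 0.9 threshold range.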
APPENDIX II: Geometrically Derived Vector Training Algorithm
Initialize:

For each stem, i ∈ {all stems in corpus}

V_i = rand_vector // random vector for stem i
Normalize V_i to length 1

ΔV_i = 0 // zero initialized update vector for stem i

END

For each stem, i ∈ {all stems in corpus}
Calculate Updates:

For each stem, j ∈ {all stems that co-occurred with stem i}, j ≠ i

We wish to calculate a new vector, U_ij, that is the
ideal position of V_i with respect to V_j. In other
words, we want the dot product of U_ij with V_j to be
d_ij, we want U_ij to have unit length, and we want U_ij
to lie in the plane defined by V_i and V_j.

D = [...] // vector difference between vectors for
stems j and i.

θ = D - V_j · dot(V_j, D) // θ is vector of components of D
which are orthogonal to V_j. This defines a plane
between V_j and θ in which V_i lies.

θ = θ / ||θ|| // normalize θ

l = [...] // l is weight for θ

IF d_ij > 0 THEN // if positive relationship between
stems j and i
U_ij = [...]
ELSE IF d_ij < 0 THEN // if negative relationship
U_ij = [...]
END IF

U_ij = U_ij / ||U_ij|| // normalize

We construct a weighted sum of the U_ij for all j to
derive an estimate of where V_i should be.

IF weight_mode == LOG_FREQ THEN

ΔV_i = ΔV_i + U_ij · [1 - dot(U_ij, V_i)] · [1 + log F[j]]

ELSE IF weight_mode == FREQ THEN

ΔV_i = ΔV_i + U_ij · [1 - dot(U_ij, V_i)] · F[j]

ELSE

ΔV_i = ΔV_i + U_ij · [1 - dot(U_ij, V_i)]

END IF

END j

Perform Update:

V_i^new = (1 - gamma) · V_i + gamma · ΔV_i

END i

NOTES:

1) Stems here are root merchant names.

2) The list of stems j (merchant names) which co-occur with stem i
is known from the co-occurrence data.

3) d_ij is relationship strength measure, calculated by UDL1, UDL2,
or UDL3.

4) F[j] is the frequency at which stem j appears in the data.

5) Weight_mode is a user controlled value that determines the
influence that F[j] has on the U. If weight_mode is FREQ then
the frequency of stem j directly affects U, so that higher
frequency stems (merchant names) strongly influence the
resulting merchant vector of merchant i. A slower influence is
provided by weight_mode == LOG_FREQ, which uses the log of F[j].
If weight_mode is not set, then the default is no influence by
F[j].

6) Gamma is a learning rate 0-1, typically 0.5 to 0.9
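
The geometric construction can be sketched under stated assumptions. Several assignments in the appendix are not legible in the source, so the sketch does not reproduce them: it builds the ideal position directly as d_ij times V_j plus sqrt(1 - d_ij^2) times the unit vector θ, which satisfies the three goals named in the appendix (dot product d_ij with V_j, unit length, in the plane of V_i and V_j). It also assumes D = V_i - V_j, unit-length input vectors, no frequency weighting, and renormalization after the update; the function names are hypothetical.

```python
import math


def dot(a, b):
    """Dot product of two equal-length vectors."""
    return sum(x * y for x, y in zip(a, b))


def normalize(v):
    """Scale a vector to unit length (zero vectors are returned unchanged)."""
    length = math.sqrt(dot(v, v))
    return [x / length for x in v] if length > 0.0 else list(v)


def ideal_position(v_i, v_j, d_ij):
    """Unit vector in the plane of v_i and v_j whose dot product with v_j is d_ij.

    Assumes v_i and v_j have unit length and -1 <= d_ij <= 1.
    Returns None when v_i is parallel to v_j (the plane is undefined).
    """
    diff = [a - b for a, b in zip(v_i, v_j)]             # assumed D = V_i - V_j
    proj = dot(v_j, diff)
    theta = [x - proj * y for x, y in zip(diff, v_j)]    # part of D orthogonal to V_j
    theta_len = math.sqrt(dot(theta, theta))
    if theta_len == 0.0:
        return None
    theta = [x / theta_len for x in theta]
    side = math.sqrt(max(0.0, 1.0 - d_ij * d_ij))
    return [d_ij * y + side * t for y, t in zip(v_j, theta)]


def update_vector(v_i, neighbours, gamma=0.5):
    """One update of V_i from a list of (v_j, d_ij) pairs, without frequency weights."""
    delta = [0.0] * len(v_i)
    for v_j, d_ij in neighbours:
        u = ideal_position(v_i, v_j, d_ij)
        if u is None:
            continue
        pull = 1.0 - dot(u, v_i)  # larger when V_i is far from its ideal position
        delta = [dv + pull * x for dv, x in zip(delta, u)]
    blended = [(1.0 - gamma) * a + gamma * b for a, b in zip(v_i, delta)]
    return normalize(blended)  # renormalizing here is an assumption


u = ideal_position([1.0, 0.0], [0.0, 1.0], 0.5)
new_v = update_vector([1.0, 0.0], [([0.0, 1.0], 0.5)], gamma=0.5)
```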
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APPENDIX m: Algd^raically Derived Vector Training Algorithm 
Initialize: 

For each stem, i e {all stems in corpus} . 

Pf ta rand ^vector // initialize a random vector for stem i 
Normalize //normalize vector to unit length 

Af^i=5, //define a zero initialized update vector for 
stem i 

For each stem, i € {all stems in corpus} 
Calculate Updates: 

For each stem, j € {all stems that co - occurred with stem i), j ^ i 
// this is all merchants j which co-occur with 
merchant i 

We wish to calculate a new vector, C/^, that is the 
ideal position of with respect to Vj . In other 
words, we want the dot product of C/^with Vj to be 
dg, we want U^to have unit length, and we want 
to lie in the plane defined by V, and Vj . 

Ug can be expressed as a linear combination of 
and Vj where: 

He construct a weighted sum of the U^tox all j to 
derive an estimate of where should be. 
IF uelght^modB — LOG^FREQ THEN 

AF, » A^;+a^-[l-rfor(c/^.Fj]-[l + logF[/II 
ELSE IF weigh t_mode — FREQ THEN 

AF, - AF,+C7, (i-^*(^7,.K,)]-FL/1 
ELSE 

AF, = AF, -[l-rfM^tf.^^Jl 
END IF 
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END j 

Perform Update: 

y/^ (I ~ gamma) Vf gamma AVi 

END i 



Notes: 

1) Stems here are root merchant names. 

2) The list of stems j (merchant names) which co-occur with stem i 
is known from the co-occurrence data. 

3) d_ij is the relationship strength measure, calculated by UDL1, UDL2,
or UDL3.

4) F[j] is the frequency at which stem j appears in the data.

5) Weight mode is a user controlled value that determines the
influence that F[j] has on the U. If weight_mode is FREQ then
the frequency of stem j directly affects U, so that higher
frequency stems (merchant names) strongly influence the
resulting merchant vector of merchant i. A slower influence is
provided by weight_mode = LOG_FREQ, which uses the log of F[j].
If weight_mode is not set, then the default is no influence by
F[j].

6) Gamma is a learning rate 0-1, typically 0.5 to 0.9 
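
The procedure of this appendix can be sketched in Python as follows. This is a minimal illustration under stated assumptions, not the implementation described by the specification: the coefficients of the linear combination are derived here from the stated constraints (dot product d_ij with V_j, unit length, in the plane of V_i and V_j), each vector is renormalized after its update, and the names `init_vectors`, `ideal_position`, `train_pass`, `d` and `freq` are hypothetical.

```python
import math
import random

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def init_vectors(stems, dim, seed=0):
    """Random unit vector per stem (the Initialize step)."""
    rng = random.Random(seed)
    return {s: normalize([rng.gauss(0.0, 1.0) for _ in range(dim)]) for s in stems}

def ideal_position(v_i, v_j, d_ij):
    """U_ij = a*V_i + b*V_j with |U_ij| = 1 and U_ij . V_j = d_ij.

    Assumption: a and b are solved from those two constraints, taking
    a >= 0 so that U_ij stays on the same side of V_j as V_i.
    """
    eps = dot(v_i, v_j)
    denom = 1.0 - eps * eps
    if denom < 1e-12:  # V_i and V_j are (anti)parallel, so no plane is defined
        return list(v_i)
    a = math.sqrt(max(0.0, 1.0 - d_ij * d_ij) / denom)
    b = d_ij - a * eps
    return [a * x + b * y for x, y in zip(v_i, v_j)]

def train_pass(vectors, d, freq, gamma=0.5, weight_mode=None):
    """One in-place pass over all stems; d[i][j] is the desired dot product."""
    for i in vectors:
        v_i = vectors[i]
        delta = [0.0] * len(v_i)
        for j, d_ij in d.get(i, {}).items():
            if j == i:
                continue
            u_ij = ideal_position(v_i, vectors[j], d_ij)
            weight = 1.0 - dot(u_ij, v_i)  # error term: zero when V_i is already ideal
            if weight_mode == "LOG_FREQ":
                weight *= 1.0 + math.log(freq[j])
            elif weight_mode == "FREQ":
                weight *= freq[j]
            delta = [acc + weight * x for acc, x in zip(delta, u_ij)]
        mixed = [(1.0 - gamma) * x + gamma * dx for x, dx in zip(v_i, delta)]
        vectors[i] = normalize(mixed)  # assumption: keep vectors at unit length
    return vectors
```

Calling `train_pass` repeatedly on two orthogonal vectors with a desired dot product of 0.5 moves their actual dot product from 0 toward 0.5 without overshooting, because each updated vector is a positive combination of its current and ideal positions.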



Claims 

1. A method of predicting financial behavior of consumers, comprising: 

generating from transaction data for a plurality of consumers, a date ordered sequence of transactions for each
consumer;

selecting for each consumer a set of the date ordered transactions to form a group of input transactions for the
consumer; and

for each consumer, applying the input transactions of the consumer to each of a plurality of merchant segment
predictive models, each merchant segment predictive model defining for a group of merchants a prediction
function between input transactions in a past time interval and predicted spending in a subsequent time interval,
to produce for each consumer a predicted spending amount in each merchant segment.

2. The method of claim 1, further comprising:

for each consumer, associating the consumer with the merchant segment for which the consumer had the highest
predicted spending relative to other merchant segments.

3. The method of claim 1, further comprising:

for each merchant segment, determining a segment vector as a summary vector of merchant vectors of merchants
associated with the segment; and

for each consumer, associating the consumer with the merchant segment having the greatest dot product
between the segment vector of the segment and a consumer vector of the consumer.
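
The association by greatest dot product recited in claim 3 can be illustrated with a short Python sketch. It assumes that a summary vector is the normalized sum of its member vectors, which is only one possible reading; the function names and the sample vectors are hypothetical.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def summary_vector(vectors):
    """Normalized sum of equal-length vectors (an assumed form of summary vector)."""
    total = [sum(column) for column in zip(*vectors)]
    length = math.sqrt(dot(total, total))
    return [x / length for x in total]

def best_segment(consumer_vector, segment_vectors):
    """Name of the segment whose vector has the greatest dot product with the consumer vector."""
    return max(segment_vectors, key=lambda name: dot(segment_vectors[name], consumer_vector))

# Hypothetical two-dimensional merchant vectors grouped into two segments.
segments = {
    "travel": summary_vector([[1.0, 0.0], [0.8, 0.6]]),
    "retail": summary_vector([[0.0, 1.0], [0.6, 0.8]]),
}
# Consumer vector summarizing the merchants this consumer patronized.
consumer = summary_vector([[1.0, 0.0], [0.8, 0.6], [0.0, 1.0]])
chosen = best_segment(consumer, segments)
```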
4. The method of claim 1, further comprising:

for each merchant segment:

ranking the consumers by their predicted spending in the merchant segment;
determining for each consumer a percentile ranking in the merchant segment;

for each consumer:

determining the merchant segment in which the consumer's percentile ranking is the highest to uniquely
associate each consumer with one merchant segment;

for each merchant segment, determining summary transaction statistics for the consumers uniquely associated
with the merchant segment.
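
A minimal Python sketch of the percentile-based assignment in the preceding claim is given below. The data layout (`{segment: {consumer: amount}}`), the percentile definition (rank position divided by segment population) and the function names are assumptions made for illustration.

```python
def percentile_ranks(predicted):
    """Map {segment: {consumer: amount}} to {segment: {consumer: percentile}}."""
    ranks = {}
    for segment, spend in predicted.items():
        ordered = sorted(spend, key=spend.get)  # lowest predicted spending first
        count = len(ordered)
        ranks[segment] = {c: 100.0 * (pos + 1) / count for pos, c in enumerate(ordered)}
    return ranks

def assign_segments(predicted):
    """Associate each consumer with the one segment where its percentile is highest."""
    ranks = percentile_ranks(predicted)
    consumers = {c for spend in predicted.values() for c in spend}
    assignment = {}
    for consumer in consumers:
        candidates = [s for s in ranks if consumer in ranks[s]]
        assignment[consumer] = max(candidates, key=lambda s: ranks[s][consumer])
    return assignment

# Hypothetical predicted spending amounts per segment.
predicted = {
    "travel": {"ann": 900.0, "bob": 100.0, "cy": 300.0},
    "electronics": {"ann": 50.0, "bob": 400.0, "cy": 500.0},
}
assignment = assign_segments(predicted)
```

Ranking by percentile rather than by raw amount keeps high-ticket segments from absorbing every consumer: "cy" is assigned to electronics because that is where "cy" ranks highest relative to other consumers.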

5. The method of claim 1, further comprising:

for each merchant segment:

ranking the consumers by their predicted spending in the merchant segment;
determining for each consumer a percentile ranking in the merchant segment;

selecting as a population, the consumers having a percentile ranking in excess of a predetermined percentile;
determining summary transaction statistics for selected population of consumers.

6. The method of one of the preceding claims, further comprising:

to co-occurrences of each merchant in the transaction data.

7. The method of claim 6, further comprising:

updating the merchant vector of each merchant based upon an unexpected amount deviation in a frequency
of co-occurrence of the merchant with other merchants.

8. The method of claim 6 or 7, further comprising:

determining a co-occurrence frequency for each merchant with each other merchant in the transaction data;
determining for each pair of merchants, a relationship strength between the pair of merchants based on how
much the determined co-occurrence frequency deviates from an expected co-occurrence frequency;
for each pair of merchant vectors, mapping the relationship strength into a vector space as a desired dot
product between respective merchant vectors the merchants in the pair; and

updating each merchant vectors so that the actual dot products between each pair of merchant vectors
approximates the desired dot product between the merchant vectors.
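
The first two steps of the preceding claim (counting co-occurrences and comparing them with an expected frequency) can be sketched in Python. The sketch rests on assumptions that are not taken from the claims: co-occurrence is counted within a fixed window of each consumer's date ordered merchant sequence, the expected frequency is the product of the two merchants' co-occurrence totals divided by the grand total, and the strength is a plain ratio of actual to expected. All function names are hypothetical.

```python
from collections import Counter

def cooccurrence_counts(sequences, window=1):
    """Count unordered merchant pairs lying within `window` positions of each other."""
    counts = Counter()
    for seq in sequences:
        for pos, merchant in enumerate(seq):
            for other in seq[pos + 1 : pos + 1 + window]:
                if other != merchant:
                    counts[frozenset((merchant, other))] += 1
    return counts

def relationship_strengths(counts):
    """Ratio of actual to expected co-occurrence for every observed pair."""
    per_merchant = Counter()
    for pair, t_ij in counts.items():
        for merchant in pair:
            per_merchant[merchant] += t_ij
    grand_total = sum(per_merchant.values())
    strengths = {}
    for pair, t_ij in counts.items():
        m_i, m_j = tuple(pair)
        expected = per_merchant[m_i] * per_merchant[m_j] / grand_total  # assumed independence model
        strengths[pair] = t_ij / expected
    return strengths

# Hypothetical date ordered merchant sequences for three consumers.
sequences = [["A", "B", "A", "B"], ["A", "C"], ["C", "D", "C", "D"]]
counts = cooccurrence_counts(sequences)
strengths = relationship_strengths(counts)
```

A ratio above 1 means the pair co-occurs more often than expected and a ratio below 1 means less often; mapping such a value to a desired dot product is a separate step not shown here.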

9. The method of claim 8, wherein determining for each pair of merchants, a relationship strength between the pair of
merchants further comprises:

determining the relationship strength by

$$r_{ij} = \frac{T_{ij}}{\hat{T}_{ij}}$$

where

$r_{ij}$ is the relationship strength between merchant_i and merchant_j in a pair of merchants;

$T_{ij}$ is the actual co-occurrence frequency of merchant_i and merchant_j in the transaction data; and

$\hat{T}_{ij}$ is the expected co-occurrence frequency of merchant_i and merchant_j in the transaction data.

10. The method of claim 8, wherein determining for each pair of merchants, a relationship strength between the pair of
merchants further comprises:

determining the relationship strength by

$$r_{ij} = \operatorname{sign}(T_{ij} - \hat{T}_{ij}) \cdot \sqrt{2 \ln \lambda}$$

where

$r_{ij}$ is the relationship strength between merchant_i and merchant_j in a pair of merchants;
$\lambda$ is a log-likelihood ratio;

$T_{ij}$ is the actual co-occurrence frequency of merchant_i and merchant_j in the transaction data; and
$\hat{T}_{ij}$ is the expected co-occurrence frequency of merchant_i and merchant_j in the transaction data.

11. The method of claim 8, wherein determining for each pair of merchants, a relationship strength between the pair of
merchants further comprises:

determining the relationship strength by

$$r_{ij} = \operatorname{sign}(T_{ij} - \hat{T}_{ij}) \cdot \sqrt[4]{2 \ln \lambda} = \operatorname{sign}(T_{ij} - \hat{T}_{ij}) \cdot \sqrt{\sqrt{2 \ln \lambda}}$$

where

$r_{ij}$ is the relationship strength between merchant_i and merchant_j in a pair of merchants;

$\lambda$ is a log-likelihood ratio;

$T_{ij}$ is the actual co-occurrence frequency of merchant_i and merchant_j in the transaction data; and
$\hat{T}_{ij}$ is the expected co-occurrence frequency of merchant_i and merchant_j in the transaction data.
12. The method of one of claims 8 to 11, wherein updating each merchant vectors so that the actual dot products
between each pair of merchant vectors approximates the desired dot product between the merchant vectors
comprises a gradient descent update that updates the merchant vectors according to whether the actual dot product
between them is greater or lesser than the desired dot product.
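
One way to picture such a gradient descent update is the Python sketch below. It assumes a squared-error objective on the difference between desired and actual dot products and renormalizes the vector after each step; the learning rate, the objective and the function names are illustrative assumptions, not details taken from the claims.

```python
import math

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def normalize(v):
    length = math.sqrt(dot(v, v))
    return [x / length for x in v]

def gradient_step(v_i, v_j, desired, rate=0.1):
    """Move V_i along V_j in proportion to (desired - actual) dot product.

    For the assumed error E = 0.5 * (desired - V_i . V_j)**2, the gradient
    with respect to V_i is -(desired - V_i . V_j) * V_j.
    """
    error = desired - dot(v_i, v_j)
    moved = [x + rate * error * y for x, y in zip(v_i, v_j)]
    return normalize(moved)  # assumption: merchant vectors are kept at unit length

# Actual dot product too small: the vector is pulled toward V_j.
closer = gradient_step([1.0, 0.0], [0.0, 1.0], desired=0.5)
# Actual dot product too large: the vector is pushed away from V_j.
farther = gradient_step([0.6, 0.8], [0.0, 1.0], desired=0.5)
```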

13. The method of one of claims 8 to 12, wherein updating each merchant vectors so that the actual dot products
between each pair of merchant vectors approximates the desired dot product between the merchant vectors
comprises determining for each merchant vector an error weighted average of the desired positions of the merchant
vector from current position of each other merchant vector and the desired dot product between the merchant
vector and each other merchant vector.

14. The method of claim 1, further comprising:

determining for each merchant name in the transaction data a merchant vector;

clustering the merchant vectors to form a plurality of merchant segments, wherein each merchant vector is
associated with one and only one merchant segment;

for each merchant segment, determining from the transactions of consumers at the associated merchants of
the merchant, statistical measures of consumer transactions in the segment.

15. The method of one of the preceding claims, further comprising:

selecting a plurality of consumers associated with at least one merchant segment, the selected plurality
selected according to their predicted spending in the merchant segment; and
providing promotional offers to the selected plurality of consumers.

16. The method of one of the preceding claims, further comprising:

training each of the merchant segment predictive models to predict spending in a predicted time period based
upon transaction statistics of the consumer's transactions in a past time period.

17. The method of claim 16, wherein the transaction statistics comprises variables describing the recency of the
consumer's transactions in one or more merchant segments, the frequency of the consumer's transactions in one or
more merchant segments, and the amount of the consumer's transactions in one or more merchant segments.

18. A system for predicting consumer financial behavior, comprising:

a plurality of merchant segments, each merchant segment having a set of merchants;
a plurality of merchant segment predictive models, each model associated with one of the merchant segments
for predicting spending by an individual consumer in the merchant segment in a predicted time period as a
function of transaction statistics of the consumer for transactions in a prior time period; and
a data processing module that receives transaction data for a consumer, and constructs the transaction
statistics for the prior time period for input into selected ones of the merchant segment predictive models.

19. A system for forming merchant segments, comprising:

a data processing module that receives consumer transaction data for a plurality of consumer accounts, and
organizes the transaction data by account and within account sequences;

a data processing module that determines from the sequenced transaction data an expected frequency of
co-occurrence for each merchant, and that constructs for each merchant a merchant vector as a function of
unexpected frequency of co-occurrences of the merchant; and
a clustering module that clusters the merchant vectors into merchant segments by determining merchant
vectors that are closely aligned with each other.

20. A method for determining whether two strings are substantially the same, comprising:

determining for each of a plurality of substrings a weight as a function of a frequency of the substring in a data set;
defining for each substring an orthogonal unit vector, using the plurality of substrings as the number of
dimensions;

for each of two strings to be compared, defining a vector which is the sum of the unit vectors of all
substrings in the string;

determining the dot product of the string vectors for the two strings; and
determining the two strings to be substantially the same if the dot product exceeds a predetermined threshold.
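
The string comparison of claim 20 can be sketched in Python as below. Several choices are assumptions made only for illustration: substrings are taken to be character trigrams, the weight decreases with the logarithm of the substring's frequency, each string vector is the weighted sum of its substring unit vectors scaled to unit length, and the threshold is 0.4. A dictionary keyed by substring plays the role of the orthogonal unit vectors, one dimension per distinct substring.

```python
import math
from collections import Counter

def substrings(name, size=3):
    text = name.upper()
    return [text[k:k + size] for k in range(len(text) - size + 1)]

def substring_weights(corpus, size=3):
    """Weight per substring as a decreasing function of its frequency (assumed form)."""
    freq = Counter(g for name in corpus for g in substrings(name, size))
    return {g: 1.0 / (1.0 + math.log(n)) for g, n in freq.items()}

def string_vector(name, weights, size=3):
    """Sparse weighted sum of substring unit vectors, scaled to unit length."""
    vec = Counter()
    for g in substrings(name, size):
        vec[g] += weights.get(g, 1.0)  # unseen substrings get weight 1.0
    length = math.sqrt(sum(w * w for w in vec.values()))
    return {g: w / length for g, w in vec.items()} if length else {}

def similarity(a, b, weights):
    """Dot product of the two string vectors."""
    va, vb = string_vector(a, weights), string_vector(b, weights)
    return sum(w * vb[g] for g, w in va.items() if g in vb)

def substantially_same(a, b, weights, threshold=0.4):
    return similarity(a, b, weights) >= threshold

# Merchant names of the kind shown in FIG. 7a.
corpus = ["STAPLES #308", "STAPLES #225", "NORDSTROM JEWEL", "NORDSTROM MENS STORE"]
weights = substring_weights(corpus)
```

With these assumptions, two store-numbered variants of the same merchant name share most of their trigrams and exceed the threshold, while names with no trigram in common have a dot product of zero.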
SAMPLE MERCHANT SEGMENTS

200 202 204 206 208

(1) : Direct Marketing: Housewares Appliances: Senior
(2) : Retail: Mall: Sporting Goods and Entertainment: Young adult
(18) : Travel: Tourist Golf: Traveler
(19) : Retail: Department Stores: Furniture
(20) : Retail: Mall: Clothing and Accessories: Male and Female
(21) : Retail: Shoes: Furniture and Accessories
(103) : Direct Marketing: Social Services: Religion
(104) : Retail: Clothing: Family: SE Pennsylvania
(105) : Direct Marketing: Internet and Catalog: PCs: Adult
(106) : Retail: Housewares and Utilities: Homeowners
(107) : Retail: Auto: Housewares: Virginia
(108) : Retail: Housewares: Homeowners: CA: MV: WA
(173) : Retail: Computers: Sports: Student: RI
(174) : Services: Financial: Casinos: Gamblers:
(175) : Retail: Home and Accessories
(176) : Education: Tuition: Books: Student: RI
(206) : Retail: Direct Market: Catalog: Women Clothing: Female
(207) : Retail: Home Improvement: Female
(208) : Direct Marketing: Catalog: Office Supplies: Business Owners
(209) : Retail: Department Stores: General Merch: Youth
(210) : Retail: Furniture: Recreation: Student: CA
(211) : Direct Marketing: Catalog
(212) : Retail: Sporting Goods: Tennis: Male
(253) : Retail: Books: Electronics: Jewelry
(254) : Recreation: Sports Fans: Hardware: Male: CA
(255) : Direct Marketing: Electronics: Male
(256) : Retail: Electronics: Office Supplies
(257) : Retail: Electronics
(258) : Retail: Yard and Garden: Automotive: NV
(299) : Retail: Household: Yard and Garden: NV
(300) : Direct Marketing: Catalog: Music

FIG. 2

date: 970510 sic05943 $ 37.70 STAPLES #308
date: 970510 sic05311 $ 96.98 NORDSTROM JEWEL
date: 970513 sic03066 $ 81.00 SOUTHWEST AIRLINES
date: 970524 sic03387 $ 95.27 ALAMO LAX
date: 970526 sic03638 $ 128.43 BEVERLY HILLS HILTON
date: 970616 sic03000 $ 220.00 STAPLES #225
date: 970617 sic03066 $ 194.00 SOUTHWEST AIRLINES
date: 970623 sic03700 $ 13.44 ALAMO SAN FRANCISCO
date: 970629 sic07538 $ 41.25 SAN FRANCISCO HILTON
date: 970703 sic05311 $ 88.76 NORDSTROM MENS STORE

FIG. 7a

FIG. 7b